ARM610 Data Sheet

Zarlink part number: P610ARM-B/KG/FPNR

Notes
1) The original P610ARM/KG/FPNR is obsolete.
2) This data sheet includes the performance data previously supplied in supplement MS4397 - Jan 1996.

DS3554 Issue 3.2 October 2001

Manufactured under licence from Advanced RISC Machines Ltd.
ARM and the ARM logo are trademarks of Advanced RISC Machines Ltd.
Advanced RISC Machines Ltd 1999
P610ARM-B/KW/FPNR
Preface

The ARM610 is a general purpose 32-bit microprocessor with 4 Kbyte cache, write buffer and memory management unit (MMU) combined in a single chip. The ARM610 offers high level RISC performance yet its fully static design ensures minimal power consumption, making it ideal for portable, low-cost systems. The innovative MMU supports a conventional two-level page-table structure and a number of extensions which make it ideal for embedded control, UNIX and object oriented systems. This results in a high instruction throughput and impressive real-time interrupt response from a small and cost-effective chip.

Applications

The ARM610 is ideally suited to those applications requiring RISC performance from a compact, power efficient processor. These include:
  Personal computer devices, e.g. PDAs
  High-performance real-time control systems
  Portable telecommunications
  Data communications equipment
  Consumer products
  Automotive

Feature summary

  High performance RISC: 25 MIPS sustained @ 33 MHz (33 MIPS peak)
  Fast sub microsecond interrupt response for real-time applications
  Memory management unit (MMU) support for virtual memory systems
  Excellent high-level language support
  4 Kbyte of instruction & data cache
  Big and little endian operating modes
  Write buffer enhancing performance
  IEEE 1149.1 boundary scan
  Fully static operation, low power consumption, ideal for power sensitive applications
  144 thin quad flat pack (TQFP) package

[Figure: simplified block diagram showing the ARM6 CPU, MMU, 4 Kbyte cache, write buffer, control and JTAG blocks, and the address bus]
Contents

1 Introduction 1-1
  1.1 Introduction 1-2
  1.2 Block diagram 1-4
  1.3 Functional diagram 1-5
2 Signal description 2-1
  2.1 Signal description 2-2
3 Programmer's model 3-1
  3.1 Introduction 3-2
  3.2 Register configuration 3-2
  3.3 Operating mode selection 3-3
  3.4 Registers 3-3
  3.5 Exceptions 3-6
  3.6 Reset 3-10
4 Instruction set 4-1
  4.1 Instruction set summary 4-2
  4.2 The condition field 4-5
  4.3 Branch and branch with link (B, BL) 4-7
  4.4 Data processing 4-9
  4.5 PSR transfer (MRS, MSR) 4-17
  4.6 Multiply and multiply-accumulate (MUL, MLA) 4-22
  4.7 Single data transfer (LDR, STR) 4-24
  4.8 Halfword and signed data transfer 4-30
  4.9 Block data transfer (LDM, STM) 4-36
  4.10 Single data swap (SWP) 4-43
  4.11 Software interrupt (SWI) 4-45
  4.12 Coprocessor data operations (CDP) 4-47
  4.13 Coprocessor data transfers (LDC, STC) 4-49
  4.14 Coprocessor register transfers (MRC, MCR) 4-53
  4.15 Undefined instruction 4-55
  4.16 Instruction set examples 4-56
5 Configuration 5-1
  5.1 Configuration 5-2
  5.2 Internal coprocessor instructions 5-2
  5.3 Registers 5-2
6 Instruction and data cache (IDC) 6-1
  6.1 Introduction 6-2
  6.2 Cacheable bit - C 6-2
  6.3 Updateable bit - U 6-2
  6.4 IDC operation 6-2
  6.5 IDC validity 6-3
  6.6 Read-lock-write 6-3
  6.7 IDC enable/disable and reset 6-4
7 Write buffer (WB) 7-1
  7.1 Introduction 7-2
  7.2 Bufferable bit 7-2
  7.3 Write buffer operation 7-2
8 Coprocessors 8-1
  8.1 Overview 8-2
9 Memory management unit 9-1
  9.1 Memory management unit (MMU) 9-2
  9.2 MMU program accessible registers 9-2
  9.3 Address translation 9-3
  9.4 Translation process 9-4
  9.5 Level one descriptor 9-5
  9.6 Page table descriptor 9-5
  9.7 Section descriptor 9-6
  9.8 Translating section references 9-7
  9.9 Level two descriptor 9-8
  9.10 Translating small page references 9-9
  9.11 Translating large page references 9-10
  9.12 MMU faults and CPU aborts 9-11
  9.13 Fault address and fault status registers (FAR and FSR) 9-11
  9.14 Domain access control 9-13
  9.15 Fault checking sequence 9-14
  9.16 External aborts 9-16
  9.17 Interaction of the MMU, IDC and write buffer 9-17
  9.18 Effect of reset 9-18
10 Bus interface 10-1
  10.1 Introduction 10-2
  10.2 ARM610 cycle speed 10-2
  10.3 Cycle types 10-2
  10.4 Memory access 10-2
  10.5 Read/write 10-3
  10.6 Byte/word 10-3
  10.7 Maximum sequential length 10-3
  10.8 Memory access types 10-5
  10.9 ARM610 cycle type summary 10-9
11 Boundary-scan test interface 11-1
  11.1 Introduction 11-2
  11.2 Overview 11-2
  11.3 Reset 11-3
  11.4 Pullup resistors 11-3
  11.5 Instruction register 11-3
  11.6 Public instructions 11-3
  11.7 Test data registers 11-7
  11.8 Boundary-scan interface signals 11-10
12 DC parameters 12-1
  12.1 Absolute maximum ratings 12-2
  12.2 DC operating conditions 12-2
  12.3 DC characteristics 12-3
13 AC parameters 13-1
  13.1 Test conditions 13-2
  13.2 Relationship between FCLK and MCLK 13-2
  13.3 Main bus signals 13-4
14 Physical details 14-1
  14.1 Physical details 14-2
15 Pinout 15-1
  15.1 Pinout 15-2
A Backward compatibility A-1
  Backward compatibility A-2

1 Introduction

This chapter introduces the ARM610 data sheet.
  1.1 Introduction 1-2
  1.2 Block diagram 1-4
  1.3 Functional diagram 1-5

1.1 Introduction

The ARM610 is a general purpose 32-bit microprocessor with 4 Kbyte cache, write buffer and memory management unit (MMU) combined in a single chip. The CPU within the ARM610 is the ARM6. The ARM610 is software compatible with the ARM processor family and can be used with ARM support chips, e.g. I/O, memory and video.

The ARM610 architecture is based on 'reduced instruction set computer' (RISC) principles, and the instruction set and related decode mechanism are greatly simplified compared with microprogrammed 'complex instruction set computers' (CISC).

The on-chip mixed data and instruction cache together with the write buffer substantially raise the average execution speed and reduce the average amount of memory bandwidth required by the processor. This allows the external memory to support additional processors or direct memory access (DMA) channels with minimal performance loss.

The MMU supports a conventional two-level page-table structure and a number of extensions which make it ideal for embedded control, UNIX and object oriented systems.

The instruction set comprises ten basic instruction types:
  Two of these make use of the on-chip arithmetic logic unit, barrel shifter and multiplier to perform high-speed operations on the data in a bank of 31 registers, each 32 bits wide.
  Three classes of instruction control data transfer between memory and the registers, one optimised for flexibility of addressing, another for rapid context switching and the third for swapping data.
  Two instructions control the flow and privilege level of execution.
  Three types are dedicated to the control of external coprocessors which allow the functionality of the instruction set to be extended off-chip in an open and uniform way.

The ARM instruction set is a good target for compilers of many different high-level languages. Where required for critical code segments, assembly code programming is also straightforward, unlike some RISC processors which depend on sophisticated compiler technology to manage complicated instruction interdependencies.

The memory interface has been designed to allow the performance potential to be realised without incurring high costs in the memory system. Speed-critical control signals are pipelined to allow system control functions to be implemented in standard low-power logic, and these control signals permit the exploitation of paged mode access offered by industry standard DRAMs.

The ARM610 is a fully static part and has been designed to minimise its power requirements. This makes it ideal for portable applications where both these features are essential.

Data sheet notation

  0x      Marks a hexadecimal quantity.
  BOLD    External signals are shown in bold capital letters.
  binary  Where it is not clear that a quantity is binary, it is followed by the word binary.

1.2 Block diagram

[Figure 1-1: ARM610 block diagram. Blocks shown: ARM6 CPU, MMU, 4 Kbyte cache, write buffer, address buffer, control, clock, coprocessor #15 and JTAG test, linked by the internal address bus and internal data bus. Signals shown: A[31:0], D[31:0], ABE, DBE, ALE, MSE, nRW, nBW, LOCK, nMREQ, SEQ, ABORT, nIRQ, nFIQ, nRESET, nWAIT, MCLK, FCLK, SnA, TCK, TDI, TMS, nTRST, TDO, TESTOUT[2:0], TESTIN[16:0]]

1.3 Functional diagram

[Figure 1-2: Functional diagram. Signal groups labelled in the figure: bus controls, interrupts, clocks, power, JTAG, address bus, data bus, control bus, memory interface and chip test. Signals shown: ABE, DBE, ALE, MSE, nIRQ, nFIQ, nRESET, SnA, FCLK, MCLK, nWAIT, VDD, VSS, TCK, TDI, TDO, TMS, nTRST, nRW, nBW, LOCK, D[31:0], A[31:0], nMREQ, SEQ, ABORT, TESTOUT[2:0], TESTIN[16:0]]

2 Signal description

This chapter gives information on the ARM610 signals.
  2.1 Signal description 2-2

2.1 Signal description

Key to signal types
  IT     Input, TTL threshold
  OCZ    Output, CMOS levels, tristateable
  ITOTZ  Input/output tristateable, TTL thresholds
  ICK    Input, clock levels

Table 2-1: Signal descriptions

Name  Type  Description

A[31:0]  OCZ  Address bus. This bus signals the address requested for memory accesses. Normally it changes during MCLK high.

ABE  IT  Address bus enable. When this input is low, the address bus A[31:0], nRW, nBW and LOCK are put into a high impedance state (Note 1).

ABORT  IT  External abort. Allows the memory system to tell the processor that a requested access has failed. Only monitored when the ARM610 is accessing external memory.

ALE  IT  Address latch enable. This input is used to control transparent latches on the address bus A[31:0], nBW, nRW and LOCK. Normally these signals change during MCLK high, but they may be held by driving ALE low. See 13.2.2 Tald measurement on page 13-3.

D[31:0]  ITOTZ  Data bus. These are bidirectional signal paths used for data transfers between the processor and external memory. For read operations (when nRW is low), the input data must be valid before the falling edge of MCLK. For write operations (when nRW is high), the output data will become valid while MCLK is low. At high clock frequencies the data may not become valid until just after the MCLK rising edge (see 13.3 Main bus signals on page 13-3).

DBE  IT  Data bus enable. When this input is low, the data bus D[31:0] is put into a high impedance state (Note 1). The drivers will always be high impedance except during write operations, and DBE must be driven high in systems which do not require the data bus for DMA or similar activities.

FCLK  ICK  Fast clock input. When the ARM610 CPU is accessing the cache or performing an internal cycle, it is clocked with the fast clock, FCLK.

LOCK  OCZ  Locked operation. LOCK is driven high to signal a locked memory access sequence, and the memory manager should wait until LOCK goes low before allowing another device to access the memory. LOCK changes while MCLK is high and remains high during the locked memory sequence. LOCK is latched by ALE.

MCLK  ICK  Memory clock input. This clock times all ARM610 memory accesses. The low or high period of MCLK may be stretched for slow peripherals; alternatively, the nWAIT input may be used with a free-running MCLK to achieve similar effects.

MSE  IT  Memory request/sequential enable. When this input is low, the nMREQ and SEQ outputs are put into a high impedance state (Note 1).

nBW  OCZ  Not byte / word. An output signal used by the processor to indicate to the external memory system when a data transfer of a byte length is required. nBW is high for word transfers and low for byte transfers, and is valid for both read and write operations. The signal changes while MCLK is high. nBW is latched by ALE.

nFIQ  IT  Not fast interrupt request. If FIQs are enabled, the processor will respond to a low level on this input by taking the FIQ interrupt exception. This is an asynchronous, level-sensitive input, and must be held low until a suitable response is received from the processor.

nIRQ  IT  Not interrupt request. As nFIQ, but with lower priority. May be taken low asynchronously to interrupt the processor when the IRQ enable is active.

nMREQ  OCZ  Not memory request. A pipelined signal that changes while MCLK is low to indicate whether or not, in the following cycle, the processor will be accessing external memory. When nMREQ is low, the processor will be accessing external memory.

nRESET  IT  Not reset. This is a level-sensitive input which is used to start the processor from a known address. A low level will cause the current instruction to terminate abnormally, and the on-chip cache, MMU and write buffer to be disabled. When nRESET is driven high, the processor will restart from address 0. nRESET must remain low for at least two full FCLK cycles or five full MCLK cycles, whichever is greater. While nRESET is low the processor will perform idle cycles with incrementing addresses and nWAIT must be high.

nRW  OCZ  Not read/write. When high this signal indicates a processor write operation; when low, a read. The signal changes while MCLK is high. nRW is latched by ALE.

nTRST  IT  Test interface reset. Note that this pin does not have an internal pullup resistor. This pin must be pulsed or driven low to achieve normal device operation, in addition to the normal device reset (nRESET).

nWAIT  IT  Not wait. When low this allows extra MCLK cycles to be inserted in memory accesses. It must change during the low phase of the MCLK cycle to be extended.

SEQ  OCZ  Sequential address. This signal is the inverse of nMREQ, and is provided for compatibility with existing ARM memory systems.

SnA  IT  This pin should be hard wired high.

TESTIN[16:0]  IT  Test bus input. This bus is used for off-board testing of the device. When the device is fitted to a circuit all these pins must be tied low.

TESTOUT[2:0]  OCZ  Test bus output. This bus is used for off-board testing of the device. When the device is fitted to a circuit and all the TESTIN[16:0] pins are driven low, these three outputs will be driven low. Note that these pins may not be tristated, except via the JTAG test port.

TCK  IT  Test interface reference clock. This times all the transfers on the JTAG test interface.

TDI  IT  Test interface data input. Note that this pin does not have an internal pullup resistor.

TDO  OCZ  Test interface data output. Note that this pin does not have an internal pullup resistor.

TMS  IT  Test interface mode select. Note that this pin does not have an internal pullup resistor.

VDD  Positive supply. 16 pins are allocated to VDD in the 160 PQFP package.

VSS  Ground supply. 16 pins are allocated to VSS in the 160 PQFP package.

Notes
1 When output pads are placed in the high impedance state for long periods, take care that they do not float to an undefined logic level, as this can dissipate power, especially in the pads.
2 Although the input pads have TTL thresholds, and will correctly interpret a TTL level input, note that unless all inputs are driven to the voltage rails, the input circuits will consume power.
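
The nRESET entry in Table 2-1 gives its minimum low time as the greater of two full FCLK cycles and five full MCLK cycles. The following is a minimal sketch of that arithmetic, not a timing specification: the function name and the example clock frequencies are assumptions chosen for illustration, and real designs should take margins from the AC parameters chapter.

```python
def min_reset_low_ns(fclk_hz: float, mclk_hz: float) -> float:
    """Return the minimum nRESET low time in nanoseconds.

    Implements the rule from the nRESET description: at least two full
    FCLK cycles or five full MCLK cycles, whichever is greater.
    """
    two_fclk_ns = 2 * 1e9 / fclk_hz    # two FCLK periods
    five_mclk_ns = 5 * 1e9 / mclk_hz   # five MCLK periods
    return max(two_fclk_ns, five_mclk_ns)


# Hypothetical clocks (assumed values): FCLK = 33 MHz, MCLK = 20 MHz.
# Five MCLK periods (250 ns) exceed two FCLK periods (about 61 ns).
print(min_reset_low_ns(33e6, 20e6))
```

With a slow memory clock the MCLK term dominates, so the reset pulse width is usually set by the memory system rather than by the core clock.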

3 Programmer's model

This chapter describes the programmer's model for the ARM610.
  3.1 Introduction 3-2
  3.2 Register configuration 3-2
  3.3 Operating mode selection 3-3
  3.4 Registers 3-3
  3.5 Exceptions 3-6
  3.6 Reset 3-10

3.1 Introduction

The ARM610 supports a variety of operating configurations. Some are controlled by register bits and are known as the register configurations. Others may be controlled by software and these are known as operating modes.

3.2 Register configuration

The ARM610 processor provides 4 register configurations which may be changed while the processor is running and which are detailed in Chapter 4, Instruction set.

The bigend bit, in the control register, sets whether the ARM610 treats words in memory as being stored in big-endian or little-endian format; see Chapter 5, Configuration. Memory is viewed as a linear collection of bytes numbered upwards from zero. Bytes 0 to 3 hold the first stored word, bytes 4 to 7 the second and so on.

In the little-endian scheme the lowest numbered byte in a word is considered to be the least significant byte of the word and the highest numbered byte is the most significant. Byte 0 of the memory system should be connected to data lines 7 through 0 (D[7:0]) in this scheme.

In the big-endian scheme the most significant byte of a word is stored at the lowest numbered byte and the least significant byte is stored at the highest numbered byte. Byte 0 of the memory system should therefore be connected to data lines 31 through 24 (D[31:24]).

The lateabt bit in the control register (see Chapter 5, Configuration) sets the processor's behaviour when a data abort exception occurs. It only affects the behaviour of load/store register instructions and is discussed more fully in Chapter 3, Programmer's model and Chapter 4, Instruction set.

The other two configuration bits, prog32 and data32, are used for backward compatibility with earlier ARM processors (see Appendix A-1) but should normally be set to 1. This configuration extends the address space to 32 bits, introduces major changes in the programmer's model as described below and provides support for running existing 26-bit programs in the 32-bit environment. This mode is recommended for compatibility with future ARM processors and all new code should be written to use only the 32-bit operating modes.

Because the original ARM instruction set has been modified to accommodate 32-bit operation, there are certain additional restrictions which programmers must be aware of. These are indicated in the text by the words shall and shall not. Reference should also be made to the ARM application notes 'Rules for ARM Code Writers' and 'Notes for ARM Code Writers', available from your supplier.
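
The byte-lane rules for the two endian schemes in section 3.2 can be illustrated with a short sketch. This is only a software illustration of the mapping described in the text, not a model of the bus hardware; the function name `byte_lanes` and the example word are assumptions.

```python
def byte_lanes(word: int, big_endian: bool) -> dict:
    """Map byte addresses 0..3 of one stored word to (data lines, byte value)."""
    order = "big" if big_endian else "little"
    stored = word.to_bytes(4, order)  # stored[i] is the byte held at address i
    lanes = {}
    for addr, value in enumerate(stored):
        # Little-endian: byte 0 sits on D[7:0]; big-endian: byte 0 on D[31:24].
        low_bit = 8 * (3 - addr) if big_endian else 8 * addr
        lanes[addr] = (f"D[{low_bit + 7}:{low_bit}]", value)
    return lanes


# Example word (assumed value) 0x11223344 in both schemes.
for big in (False, True):
    print("big-endian" if big else "little-endian", byte_lanes(0x11223344, big))
```

In both schemes the least significant byte of the word (0x44 here) travels on D[7:0]; what changes is which byte address is associated with each group of data lines.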

3.3 Operating mode selection

The ARM610 has a 32-bit data bus and a 32-bit address bus. The data types the processor supports are bytes (8 bits) and words (32 bits), where words must be aligned to four byte boundaries. Instructions are exactly one word, and data operations (e.g. ADD) are only performed on word quantities. Load and store operations can transfer either bytes or words.

The ARM610 supports six modes of operation:
1 User mode (usr): the normal program execution state
2 FIQ mode (fiq): designed to support a data transfer or channel process
3 IRQ mode (irq): used for general purpose interrupt handling
4 Supervisor mode (svc): a protected mode for the operating system
5 Abort mode (abt): entered after a data or instruction prefetch abort
6 Undefined mode (und): entered when an undefined instruction is executed

Mode changes may be made under software control or may be brought about by external interrupts or exception processing. Most application programs will execute in User mode. The other modes, known as privileged modes, will be entered to service interrupts or exceptions or to access protected resources.

3.4 Registers

The processor has a total of 37 registers made up of 31 general 32-bit registers and 6 status registers. At any one time 16 general registers (R0 to R15) and one or two status registers are visible to the programmer. The visible registers depend on the processor mode and the other registers (the banked registers) are switched in to support IRQ, FIQ, Supervisor, Abort and Undefined mode processing. The register bank organisation is shown in Figure 3-1: Register organisation on page 3-4. The banked registers are shaded in the diagram.

In all modes 16 registers, R0 to R15, are directly accessible. All registers except R15 are general purpose and may be used to hold data or address values. Register R15 holds the program counter (PC). When R15 is read, bits [1:0] are zero and bits [31:2] contain the PC.

A seventeenth register (the CPSR - current program status register) is also accessible. It contains condition code flags and the current mode bits and may be thought of as an extension to the PC.

R14 is used as the subroutine link register and receives a copy of R15 when a branch and link instruction is executed. It may be treated as a general purpose register at all other times. R14_svc, R14_irq, R14_fiq, R14_abt and R14_und are used similarly to hold the return values of R15 when interrupts and exceptions arise, or when branch and link instructions are executed within interrupt or exception routines.

Figure 3-1: Register organisation

General registers and program counter

User32     FIQ32      Supervisor32  Abort32    IRQ32      Undefined32
R0         R0         R0            R0         R0         R0
R1         R1         R1            R1         R1         R1
R2         R2         R2            R2         R2         R2
R3         R3         R3            R3         R3         R3
R4         R4         R4            R4         R4         R4
R5         R5         R5            R5         R5         R5
R6         R6         R6            R6         R6         R6
R7         R7         R7            R7         R7         R7
R8         R8_fiq     R8            R8         R8         R8
R9         R9_fiq     R9            R9         R9         R9
R10        R10_fiq    R10           R10        R10        R10
R11        R11_fiq    R11           R11        R11        R11
R12        R12_fiq    R12           R12        R12        R12
R13        R13_fiq    R13_svc       R13_abt    R13_irq    R13_und
R14        R14_fiq    R14_svc       R14_abt    R14_irq    R14_und
R15 (PC)   R15 (PC)   R15 (PC)      R15 (PC)   R15 (PC)   R15 (PC)

Program status registers

User32     FIQ32      Supervisor32  Abort32    IRQ32      Undefined32
CPSR       CPSR       CPSR          CPSR       CPSR       CPSR
           SPSR_fiq   SPSR_svc      SPSR_abt   SPSR_irq   SPSR_und

FIQ mode has seven banked registers mapped to R8-14 (R8_fiq-R14_fiq). Many FIQ programs will not need to save any registers. User mode, IRQ mode, Supervisor mode, Abort mode and Undefined mode each have two banked registers mapped to R13 and R14. The two banked registers allow these modes to each have a private stack pointer and link register. Supervisor, IRQ, Abort and Undefined mode programs which require more than these two banked registers are expected to save some or all of the caller's registers (R0 to R12) on their respective stacks. They are then free to use these registers, which they will restore before returning to the caller.

In addition there are also five SPSRs (saved program status registers) which are loaded with the CPSR when an exception occurs. There is one SPSR for each privileged mode.
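
The banking scheme in Figure 3-1 amounts to a lookup from (mode, register number) to a physical register. The sketch below is an illustrative model of that lookup under the assumption that User mode uses the unsuffixed R13 and R14 shown in the figure; the names `BANKED` and `physical_register` are not from the data sheet.

```python
# Register numbers that are banked in each mode, following Figure 3-1.
BANKED = {
    "usr": set(),
    "fiq": set(range(8, 15)),  # R8_fiq .. R14_fiq
    "irq": {13, 14},
    "svc": {13, 14},
    "abt": {13, 14},
    "und": {13, 14},
}


def physical_register(mode: str, n: int) -> str:
    """Return the physical register seen as Rn in the given mode."""
    if not 0 <= n <= 15:
        raise ValueError("register number must be 0..15")
    if n == 15:
        return "R15 (PC)"  # the program counter is shared by all modes
    return f"R{n}_{mode}" if n in BANKED[mode] else f"R{n}"


# Counting the distinct physical registers reproduces the 31 general
# registers quoted in section 3.4.
all_regs = {physical_register(m, n) for m in BANKED for n in range(16)}
print(len(all_regs))
```

The count works out as 16 shared registers plus 7 FIQ banked registers plus 2 for each of the four remaining privileged modes.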

Figure 3-2: Format of the program status registers (PSRs)

Bit      Name      Meaning
31       N         Negative / less than
30       Z         Zero
29       C         Carry / borrow / extend
28       V         Overflow
27:8     .         Reserved
7        I         IRQ disable
6        F         FIQ disable
5        .         Reserved
4:0      M[4:0]    Mode bits

Bits 31:28 are the flags; bits 27:0 are the control bits.

The format of the program status registers is shown in Figure 3-2: Format of the program status registers (PSRs). The N, Z, C and V bits are the condition code flags. The condition code flags in the CPSR may be changed as a result of arithmetic and logical operations in the processor and may be tested by all instructions to determine if the instruction is to be executed.

The I and F bits are the interrupt disable bits. The I bit disables IRQ interrupts when it is set and the F bit disables FIQ interrupts when it is set. The M0, M1, M2, M3 and M4 bits (M[4:0]) are the mode bits, and these determine the mode in which the processor operates. The interpretation of the mode bits is shown in Table 3-1: The mode bits. Not all combinations of the mode bits define a valid processor mode. Only those explicitly described shall be used.

The bottom 28 bits of a PSR (incorporating I, F and M[4:0]) are known collectively as the control bits. The control bits will change when an exception arises and in addition can be manipulated by software when the processor is in a privileged mode. Unused bits in the PSRs are reserved and their state shall be preserved when changing the flag or control bits. Programs shall not rely on specific values from the reserved bits when checking the PSR status, since they may read as one or zero in future processors.

Table 3-1: The mode bits

M[4:0]  Mode  Accessible register set
10000   usr   PC, R14..R0                        CPSR
10001   fiq   PC, R14_fiq..R8_fiq, R7..R0        CPSR, SPSR_fiq
10010   irq   PC, R14_irq..R13_irq, R12..R0      CPSR, SPSR_irq
10011   svc   PC, R14_svc..R13_svc, R12..R0      CPSR, SPSR_svc
10111   abt   PC, R14_abt..R13_abt, R12..R0      CPSR, SPSR_abt
11011   und   PC, R14_und..R13_und, R12..R0      CPSR, SPSR_und
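
To make the PSR layout and Table 3-1 concrete, here is a small decoding sketch. It assumes the bit positions given for Figure 3-2 (N, Z, C, V in bits 31 to 28, I in bit 7, F in bit 6, M[4:0] in bits 4 to 0); the function name and the example value are illustrative only.

```python
# Valid mode encodings from Table 3-1.
MODES = {
    0b10000: "usr",
    0b10001: "fiq",
    0b10010: "irq",
    0b10011: "svc",
    0b10111: "abt",
    0b11011: "und",
}


def decode_psr(psr: int) -> dict:
    """Split a 32-bit PSR value into condition flags, disable bits and mode."""
    mode_bits = psr & 0x1F
    if mode_bits not in MODES:
        # Only the combinations listed in Table 3-1 define a valid mode.
        raise ValueError(f"invalid mode bits {mode_bits:05b}")
    return {
        "N": bool((psr >> 31) & 1),
        "Z": bool((psr >> 30) & 1),
        "C": bool((psr >> 29) & 1),
        "V": bool((psr >> 28) & 1),
        "I": bool((psr >> 7) & 1),   # IRQ disable
        "F": bool((psr >> 6) & 1),   # FIQ disable
        "mode": MODES[mode_bits],
    }


# Example value (assumed): Z and C set, both interrupts disabled, svc mode.
print(decode_psr(0x600000D3))
```

The decoder deliberately ignores the reserved bits, in line with the rule that programs shall not rely on their values.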

3.5 Exceptions

Exceptions arise whenever there is a need for the normal flow of program execution to be broken, so that (for example) the processor can be diverted to handle an interrupt from a peripheral. The processor state just prior to handling the exception must be preserved so that the original program can be resumed when the exception routine has completed. Many exceptions may arise at the same time.

The ARM610 handles exceptions by making use of the banked registers to save state. The old PC and CPSR contents are copied into the appropriate R14 and SPSR, and the PC and mode bits in the CPSR are forced to a value which depends on the exception. Interrupt disable flags are set where required to prevent otherwise unmanageable nestings of exceptions. In the case of a re-entrant interrupt handler, R14 and the SPSR should be saved onto a stack in main memory before re-enabling the interrupt; when transferring the SPSR register to and from a stack, it is important to transfer the whole 32-bit value, and not just the flag or control fields. When multiple exceptions arise simultaneously, a fixed priority determines the order in which they are handled. The priorities are listed later in this chapter.

3.5.1 FIQ

The FIQ (fast interrupt request) exception is externally generated by taking the nFIQ input low. This input can accept asynchronous transitions, and is delayed by one clock cycle for synchronisation before it can affect the processor execution flow. It is designed to support a data transfer or channel process, and has sufficient private registers to remove the need for register saving in such applications (thus minimising the overhead of context switching). The FIQ exception may be disabled by setting the F flag in the CPSR (but note that this is not possible from User mode). If the F flag is clear, the ARM610 checks for a low level on the output of the FIQ synchroniser at the end of each instruction.

When a FIQ is detected, the ARM610 performs the following:
1 Saves the address of the next instruction to be executed plus 4 in R14_fiq; saves CPSR in SPSR_fiq
2 Forces M[4:0]=10001 (FIQ mode) and sets the F and I bits in the CPSR
3 Forces the PC to fetch the next instruction from address 0x1C

To return normally from FIQ, use SUBS PC,R14_fiq,#4 which will restore both the PC (from R14) and the CPSR (from SPSR_fiq) and resume execution of the interrupted code.

3.5.2 IRQ

The IRQ (interrupt request) exception is a normal interrupt caused by a low level on the nIRQ input. It has a lower priority than FIQ, and is masked out when a FIQ sequence is entered. Its effect may be masked out at any time by setting the I bit in the CPSR (but note that this is not possible from User mode). If the I flag is clear, the ARM610 checks for a low level on the output of the IRQ synchroniser at the end of each instruction.

When an IRQ is detected, the ARM610 performs the following:
1 Saves the address of the next instruction to be executed plus 4 in R14_irq; saves CPSR in SPSR_irq
2 Forces M[4:0]=10010 (IRQ mode) and sets the I bit in the CPSR
3 Forces the PC to fetch the next instruction from address 0x18

To return normally from IRQ, use SUBS PC,R14_irq,#4 which will restore both the PC and the CPSR and resume execution of the interrupted code.

3.5.3 Abort

An abort can be signalled by either the internal memory management unit or from the external ABORT input. ABORT indicates that the current memory access cannot be completed. For instance, in a virtual memory system the data corresponding to the current address may have been moved out of memory onto a disc, and considerable processor activity may be required to recover the data before the access can be performed successfully. The ARM610 checks for ABORT during memory access cycles. When successfully aborted, the ARM610 will respond in one of two ways:

1 If the abort occurred during an instruction prefetch (a prefetch abort), the prefetched instruction is marked as invalid but the abort exception does not occur immediately. If the instruction is not executed, for example as a result of a branch being taken while it is in the pipeline, no abort will occur. An abort will take place if the instruction reaches the head of the pipeline and is about to be executed.

2 If the abort occurred during a data access (a data abort), the action depends on the instruction type.
  a) Single data transfer instructions (LDR, STR) are aborted as though the instruction had not executed if the processor is configured for early abort. When configured for late abort, these instructions are able to write back modified base registers and the abort handler must be aware of this.
  b) The swap instruction (SWP) is aborted as though it had not executed, though externally the read access may take place.
  c) Block data transfer instructions (LDM, STM) complete, and if write-back is set, the base is updated. If the instruction would normally have overwritten the base with data (i.e. LDM with the base in the transfer list), this overwriting is prevented. All register overwriting is prevented after the abort is indicated, which means in particular that R15 (which is always last to be transferred) is preserved in an aborted LDM instruction.

Note that on data aborts the ARM610 fault address and fault status registers are updated.

When either a prefetch or data abort occurs, the ARM610 performs the following:
1 Saves the address of the aborted instruction plus 4 (for prefetch aborts) or 8 (for data aborts) in R14_abt; saves CPSR in SPSR_abt
2 Forces M[4:0]=10111 (Abort mode) and sets the I bit in the CPSR
3 Forces the PC to fetch the next instruction from either address 0x0C (prefetch abort) or address 0x10 (data abort)

To return after fixing the reason for the abort, use SUBS PC,R14_abt,#4 (for a prefetch abort) or SUBS PC,R14_abt,#8 (for a data abort). This will restore both the PC and the CPSR and retry the aborted instruction.

The abort mechanism allows a demand paged virtual memory system to be implemented when suitable memory management software is available. The processor is allowed to generate arbitrary addresses, and when the data at an address is unavailable the MMU signals an abort. The processor traps into system software which must work out the cause of the abort, make the requested data available, and retry the aborted instruction. The application program needs no knowledge of the amount of memory available to it, nor is its state in any way affected by the abort.

Note that there are restrictions on the use of the external abort pin. See Chapter 9, Memory management unit.

3.5.4 Software interrupt

The software interrupt instruction (SWI) is used for getting into Supervisor mode, usually to request a particular supervisor function. When a SWI is executed, the ARM610 performs the following:
1 Saves the address of the SWI instruction plus 4 in R14_svc; saves CPSR in SPSR_svc
2 Forces M[4:0]=10011 (Supervisor mode) and sets the I bit in the CPSR
3 Forces the PC to fetch the next instruction from address 0x08

To return from a SWI, use MOVS PC,R14_svc. This will restore the PC and CPSR and return to the instruction following the SWI.
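
The return instructions quoted in sections 3.5.1 to 3.5.4 differ only in the offset subtracted from the banked R14. The sketch below tabulates those offsets and computes the resulting return address; it is an illustration of the arithmetic, with the dictionary keys, function name and example addresses chosen here as assumptions.

```python
# Offset subtracted from the banked R14 by the documented return instruction:
# SUBS PC,R14,#4 for FIQ, IRQ and prefetch abort, SUBS PC,R14,#8 for data
# abort, and MOVS PC,R14 (no offset) for SWI.
RETURN_OFFSET = {
    "fiq": 4,
    "irq": 4,
    "prefetch_abort": 4,
    "data_abort": 8,
    "swi": 0,
}


def return_address(exception: str, r14: int) -> int:
    """Return the PC value produced by the documented return instruction."""
    return (r14 - RETURN_OFFSET[exception]) & 0xFFFFFFFF


# A data abort on an instruction at the assumed address 0x8000 saves
# 0x8000 + 8 in R14_abt, so the return retries the same instruction.
print(hex(return_address("data_abort", 0x8000 + 8)))

# A SWI at 0x8000 saves 0x8000 + 4 in R14_svc, so the return lands on the
# instruction following the SWI.
print(hex(return_address("swi", 0x8000 + 4)))
```

Aborts return to the faulting instruction so that it can be retried, whereas a SWI returns to the instruction after it.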
3.5.5 undefined instruction trap

when the arm610 comes across an instruction which it cannot handle (see chapter 4, instruction set), it offers it to any coprocessors which may be present. if a coprocessor can perform this instruction but is busy at that time, arm610 will wait until the coprocessor is ready or until an interrupt occurs. if no coprocessor can handle the instruction then arm610 will take the undefined instruction trap.

the trap may be used for software emulation of a coprocessor in a system which does not have the coprocessor hardware, or for general purpose instruction set extension by software emulation.
when arm610 takes the undefined instruction trap it performs the following:

1 saves the address of the undefined or coprocessor instruction plus 4 in r14_und; saves cpsr in spsr_und
2 forces m[4:0]=11011 (undefined mode) and sets the i bit in the cpsr
3 forces the pc to fetch the next instruction from address 0x04

to return from this trap after emulating the failed instruction, use movs pc,r14_und. this will restore the cpsr and return to the instruction following the undefined instruction.

3.5.6 vector summary

the vector addresses in table 3-2 are byte addresses, and will normally contain a branch instruction pointing to the relevant routine. the fiq routine might reside at 0x1c onwards, and thereby avoid the need for (and execution time of) a branch instruction.

the reserved entry is for an address exception vector which is only operative when the processor is configured for a 26-bit program space. see chapter a, backward compatibility.

address       exception               mode on entry
0x00000000    reset                   supervisor
0x00000004    undefined instruction   undefined
0x00000008    software interrupt      supervisor
0x0000000c    abort (prefetch)        abort
0x00000010    abort (data)            abort
0x00000014    -- reserved --          --
0x00000018    irq                     irq
0x0000001c    fiq                     fiq

table 3-2: vector summary
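The entry sequences for aborts, software interrupts and undefined instructions share one pattern, which the following Python sketch models. It is an illustration only, not ARM's implementation: the names `VECTORS`, `I_BIT` and `enter_exception` are hypothetical, and the position of the i bit (bit 7 of the cpsr) is taken from the program status register format described elsewhere in the data sheet.

```python
# Hypothetical model of ARM610 exception entry, following the numbered
# steps in the text and the addresses in table 3-2.

# exception -> (vector address, mode bits M[4:0], offset added to the
# address of the affected instruction when it is saved in banked r14)
VECTORS = {
    "undefined":      (0x04, 0b11011, 4),
    "swi":            (0x08, 0b10011, 4),
    "prefetch_abort": (0x0C, 0b10111, 4),
    "data_abort":     (0x10, 0b10111, 8),
}

I_BIT = 1 << 7  # assumed position of the IRQ disable bit in the CPSR


def enter_exception(kind, instr_addr, cpsr):
    """Return (new_pc, new_cpsr, banked_r14, banked_spsr)."""
    vector, mode, offset = VECTORS[kind]
    banked_r14 = (instr_addr + offset) & 0xFFFFFFFF   # step 1: save return address
    banked_spsr = cpsr                                # step 1: save CPSR
    new_cpsr = (cpsr & ~0x1F) | mode | I_BIT          # step 2: new mode, set I
    return vector, new_cpsr, banked_r14, banked_spsr  # step 3: PC := vector
```

With this model, a data abort on the instruction at 0x8000 leaves 0x8008 in r14_abt, so `subs pc,r14_abt,#8` retries the aborted instruction, while a swi at the same address leaves 0x8004, the address of the following instruction.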
3.5.7 exception priorities

when multiple exceptions arise at the same time, a fixed priority system determines the order in which they will be handled:

1 reset (highest priority)
2 data abort
3 fiq
4 irq
5 prefetch abort
6 undefined instruction, software interrupt (lowest priority)

note that not all exceptions can occur at once. undefined instruction and software interrupt are mutually exclusive since they each correspond to particular (non-overlapping) decodings of the current instruction.

if a data abort occurs at the same time as a fiq, and fiqs are enabled (i.e. the f flag in the cpsr is clear), arm610 will enter the data abort handler and then immediately proceed to the fiq vector. a normal return from fiq will cause the data abort handler to resume execution. placing data abort at a higher priority than fiq is necessary to ensure that the transfer error does not escape detection; the time for this exception entry should be added to worst case fiq latency calculations.

3.5.8 interrupt latencies

calculating the worst case interrupt latency for the arm610 is quite complex due to the cache, mmu and write buffer and is dependent on the configuration of the whole system. please see application note - calculating the arm610 interrupt latency.

3.6 reset

when the nreset signal goes low, arm610 abandons the executing instruction and then performs idle cycles from incrementing word addresses. at the end of the reset sequence arm610 performs either 1 or 2 memory accesses from the address reached before nreset goes high.

when nreset goes high again, arm610 performs the following:

1 overwrites r14_svc and spsr_svc by copying the current values of the pc and cpsr into them. the value of the saved pc and cpsr is not defined.
2 forces m[4:0]=10011 (supervisor mode) and sets the i and f bits in the cpsr.
3 performs either one or two memory accesses from the address output at the end of the reset.
4 forces the pc to fetch the next instruction from address 0x00
at the end of the reset sequence, the mmu is disabled and the tlb is flushed, which forces flat translation (i.e. the physical address is the virtual address, and there is no permission checking); alignment faults are also disabled; the cache is disabled and flushed; the write buffer is disabled and flushed; the arm6 cpu core is put into 26-bit data and address mode, with early abort timing and little-endian mode.
4 instruction set

this chapter describes the arm instruction set.

4.1 instruction set summary
4.2 the condition field
4.3 branch and branch with link (b, bl)
4.4 data processing
4.5 psr transfer (mrs, msr)
4.6 multiply and multiply-accumulate (mul, mla)
4.7 single data transfer (ldr, str)
4.8 halfword and signed data transfer
4.9 block data transfer (ldm, stm)
4.10 single data swap (swp)
4.11 software interrupt (swi)
4.12 coprocessor data operations (cdp)
4.13 coprocessor data transfers (ldc, stc)
4.14 coprocessor register transfers (mrc, mcr)
4.15 undefined instruction
4.16 instruction set examples
4.1 instruction set summary

4.1.1 format summary

the arm instruction set formats are shown below.

figure 4-1: arm instruction set formats (fields listed from bit 31 down to bit 0)

format                            fields
data processing / psr transfer    cond | 0 0 | i | opcode | s | rn | rd | operand 2
multiply                          cond | 0 0 0 0 0 0 | a | s | rd | rn | rs | 1 0 0 1 | rm
single data swap                  cond | 0 0 0 1 0 | b | 0 0 | rn | rd | 0 0 0 0 | 1 0 0 1 | rm
single data transfer              cond | 0 1 | i | p | u | b | w | l | rn | rd | offset
undefined                         cond | 0 1 1 | (any) | 1 | (any)
block data transfer               cond | 1 0 0 | p | u | s | w | l | rn | register list
branch                            cond | 1 0 1 | l | offset
coprocessor data transfer         cond | 1 1 0 | p | u | n | w | l | rn | crd | cp# | offset
coprocessor data operation        cond | 1 1 1 0 | cp opc | crn | crd | cp# | cp | 0 | crm
coprocessor register transfer     cond | 1 1 1 0 | cp opc | l | crn | rd | cp# | cp | 1 | crm
software interrupt                cond | 1 1 1 1 | ignored by processor

note some instruction codes are not defined but do not cause the undefined instruction trap to be taken, for instance a multiply instruction with bit 6 changed to a 1. these instructions should not be used, as their action may change in future arm implementations.
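The fixed bit patterns in figure 4-1 are enough to sort a 32-bit instruction word into one of the listed formats. The Python sketch below is a hypothetical illustration (the function name `classify` is not from the data sheet). It covers only the formats shown in figure 4-1, so it does not separate out the halfword transfers of section 4.8 or the unlisted codes mentioned in the note, and it tests the most specific patterns first because multiply and swap share their top bits with data processing.

```python
def classify(word):
    """Name the figure 4-1 format that a 32-bit instruction word matches."""
    bit4 = (word >> 4) & 1
    # Multiply: bits 27:22 = 000000 and bits 7:4 = 1001.
    if (word >> 22) & 0x3F == 0 and (word >> 4) & 0xF == 0b1001:
        return "multiply"
    # Single data swap: bits 27:23 = 00010, bits 21:20 = 00, bits 11:4 = 00001001.
    if ((word >> 23) & 0x1F == 0b00010 and (word >> 20) & 0x3 == 0
            and (word >> 4) & 0xFF == 0b00001001):
        return "single data swap"
    top2 = (word >> 26) & 0x3
    if top2 == 0b00:
        return "data processing / psr transfer"
    if top2 == 0b01:
        # Bits 27:25 = 011 with bit 4 set is the undefined pattern.
        if (word >> 25) & 1 and bit4:
            return "undefined"
        return "single data transfer"
    top3 = (word >> 25) & 0x7
    if top3 == 0b100:
        return "block data transfer"
    if top3 == 0b101:
        return "branch"
    if top3 == 0b110:
        return "coprocessor data transfer"
    # Remaining case: bits 27:25 = 111.
    if (word >> 24) & 1:
        return "software interrupt"
    return "coprocessor register transfer" if bit4 else "coprocessor data operation"
```

For example, `classify(0xEAFFFFFE)` returns `"branch"`; that word is the encoding the branch examples in section 4.3.3 give for `here bal here`.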
4.1.2 instruction summary

mnemonic  instruction                                      action                                       see section
adc       add with carry                                   rd := rn + op2 + carry                       4.4
add       add                                              rd := rn + op2                               4.4
and       and                                              rd := rn and op2                             4.4
b         branch                                           r15 := address                               4.3
bic       bit clear                                        rd := rn and not op2                         4.4
bl        branch with link                                 r14 := r15, r15 := address                   4.3
cdp       coprocessor data processing                      (coprocessor-specific)                       4.12
cmn       compare negative                                 cpsr flags := rn + op2                       4.4
cmp       compare                                          cpsr flags := rn - op2                       4.4
eor       exclusive or                                     rd := (rn and not op2) or (op2 and not rn)   4.4
ldc       load coprocessor from memory                     coprocessor load                             4.13
ldm       load multiple registers                          stack manipulation (pop)                     4.9
ldr       load register from memory                        rd := (address)                              4.7, 4.8
mcr       move cpu register to coprocessor register        crn := rrn {crm}                             4.14
mla       multiply accumulate                              rd := (rm * rs) + rn                         4.6
mov       move register or constant                        rd := op2                                    4.4
mrc       move from coprocessor register to cpu register   rn := crn {crm}                              4.14
mrs       move psr status/flags to register                rn := psr                                    4.5
msr       move register to psr status/flags                psr := rm                                    4.5
mul       multiply                                         rd := rm * rs                                4.6
mvn       move negative register                           rd := 0xffffffff eor op2                     4.4
orr       or                                               rd := rn or op2                              4.4
rsb       reverse subtract                                 rd := op2 - rn                               4.4
rsc       reverse subtract with carry                      rd := op2 - rn - 1 + carry                   4.4
sbc       subtract with carry                              rd := rn - op2 - 1 + carry                   4.4
stc       store coprocessor register to memory             address := crn                               4.13
stm       store multiple                                   stack manipulation (push)                    4.9
str       store register to memory                         <address> := rd                              4.7, 4.8
sub       subtract                                         rd := rn - op2                               4.4
swi       software interrupt                               os call                                      4.11
swp       swap register with memory                        rd := [rn], [rn] := rm                       4.10
teq       test bitwise equality                            cpsr flags := rn eor op2                     4.4
tst       test bits                                        cpsr flags := rn and op2                     4.4

table 4-1: the arm instruction set
4.2 the condition field

in arm state, all instructions are conditionally executed according to the state of the cpsr condition codes and the instruction's condition field. this field (bits 31:28) determines the circumstances under which an instruction is to be executed. if the state of the c, n, z and v flags fulfils the conditions encoded by the field, the instruction is executed, otherwise it is ignored.

there are 16 possible conditions, each represented by a two-character suffix that can be appended to the instruction's mnemonic. for example, a branch (b in assembly language) becomes beq for "branch if equal", which means the branch will only be taken if the z flag is set.

in practice, 15 different conditions may be used: these are listed in table 4-2: condition code summary. the sixteenth (1111) is reserved, and must not be used.

in the absence of a suffix, the condition field of most instructions is set to "always" (suffix al). this means the instruction will always be executed regardless of the cpsr condition codes.

condition field (bits 31:28, cond):

0000 = eq - z set (equal)
0001 = ne - z clear (not equal)
0010 = cs - c set (unsigned higher or same)
0011 = cc - c clear (unsigned lower)
0100 = mi - n set (negative)
0101 = pl - n clear (positive or zero)
0110 = vs - v set (overflow)
0111 = vc - v clear (no overflow)
1000 = hi - c set and z clear (unsigned higher)
1001 = ls - c clear or z set (unsigned lower or same)
1010 = ge - n set and v set, or n clear and v clear (greater or equal)
1011 = lt - n set and v clear, or n clear and v set (less than)
1100 = gt - z clear, and either n set and v set, or n clear and v clear (greater than)
1101 = le - z set, or n set and v clear, or n clear and v set (less than or equal)
1110 = al - always
1111 = nv - never
code   suffix   flags                          meaning
0000   eq       z set                          equal
0001   ne       z clear                        not equal
0010   cs       c set                          unsigned higher or same
0011   cc       c clear                        unsigned lower
0100   mi       n set                          negative
0101   pl       n clear                        positive or zero
0110   vs       v set                          overflow
0111   vc       v clear                        no overflow
1000   hi       c set and z clear              unsigned higher
1001   ls       c clear or z set               unsigned lower or same
1010   ge       n equals v                     greater or equal
1011   lt       n not equal to v               less than
1100   gt       z clear and (n equals v)       greater than
1101   le       z set or (n not equal to v)    less than or equal
1110   al       (ignored)                      always

table 4-2: condition code summary
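Table 4-2 translates directly into a predicate on the four flags. The following Python sketch is a hypothetical helper (the name `condition_passes` is not from the data sheet) that returns whether an instruction with a given condition field would execute. It rejects 1111 because the text reserves that code.

```python
def condition_passes(cond, n, z, c, v):
    """Evaluate a 4-bit condition field against the N, Z, C and V flags."""
    n, z, c, v = bool(n), bool(z), bool(c), bool(v)
    checks = {
        0b0000: z,                       # eq
        0b0001: not z,                   # ne
        0b0010: c,                       # cs
        0b0011: not c,                   # cc
        0b0100: n,                       # mi
        0b0101: not n,                   # pl
        0b0110: v,                       # vs
        0b0111: not v,                   # vc
        0b1000: c and not z,             # hi
        0b1001: (not c) or z,            # ls
        0b1010: n == v,                  # ge
        0b1011: n != v,                  # lt
        0b1100: (not z) and (n == v),    # gt
        0b1101: z or (n != v),           # le
        0b1110: True,                    # al
    }
    if cond not in checks:
        raise ValueError("condition 1111 is reserved and must not be used")
    return checks[cond]
```

After `cmp r1,#0` with r1 equal to zero the z flag is set, so `condition_passes(0b0000, n=0, z=1, c=1, v=0)` is true and a following `beq` is taken.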
4.3 branch and branch with link (b, bl)

the instruction is only executed if the condition is true. the various conditions are defined in table 4-2: condition code summary on page 4-6. the instruction encoding is shown in figure 4-2: branch instructions, below.

figure 4-2: branch instructions

bits     field    meaning
31:28    cond     condition field
27:25    101
24       l        link bit: 0 = branch, 1 = branch with link
23:0     offset

branch instructions contain a signed two's complement 24-bit offset. this is shifted left two bits, sign extended to 32 bits, and added to the pc. the instruction can therefore specify a branch of +/- 32mbytes. the branch offset must take account of the prefetch operation, which causes the pc to be 2 words (8 bytes) ahead of the current instruction.

branches beyond +/- 32mbytes must use an offset or absolute destination which has been previously loaded into a register. in this case the pc should be manually saved in r14 if a branch with link type operation is required.

4.3.1 the link bit

branch with link (bl) writes the old pc into the link register (r14) of the current bank. the pc value written into r14 is adjusted to allow for the prefetch, and contains the address of the instruction following the branch and link instruction. note that the cpsr is not saved with the pc and r14[1:0] are always cleared.

to return from a routine called by branch with link use mov pc,r14 if the link register is still valid or ldm rn!,{..pc} if the link register has been saved onto a stack pointed to by rn.
4.3.2 assembler syntax

items in {} are optional. items in <> must be present.

b{l}{cond} <expression>

{l}            is used to request the branch with link form of the instruction. if absent, r14 will not be affected by the instruction.
{cond}         is a two-character mnemonic as shown in table 4-2: condition code summary on page 4-6. if absent then al (always) will be used.
<expression>   is the destination. the assembler calculates the offset.

4.3.3 examples

here    bal   here        ; assembles to 0xeafffffe (note effect of
                          ; pc offset).
        b     there       ; always condition used as default.
        cmp   r1,#0       ; compare r1 with zero and branch to fred
                          ; if r1 was zero, otherwise continue
        beq   fred        ; continue to next instruction.
        bl    sub+rom     ; call subroutine at computed address.
        adds  r1,r1,#1    ; add 1 to register 1, setting cpsr flags
                          ; on the result then call subroutine if
        blcc  sub         ; the c flag is clear, which will be the
                          ; case unless r1 held 0xffffffff.
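The offset calculation that the assembler performs for b and bl can be reproduced in a few lines. The Python sketch below is a hypothetical illustration (`encode_branch` and `branch_target` are invented names) of the rule in section 4.3: the stored 24-bit field is the word offset from the branch address plus 8, because the pc is two words ahead when the branch executes.

```python
def encode_branch(instr_addr, target, link=False, cond=0b1110):
    """Assemble a B/BL word for a branch at instr_addr going to target."""
    offset = target - (instr_addr + 8)          # PC is 8 bytes ahead
    if offset % 4:
        raise ValueError("branch target must be word aligned")
    words = offset >> 2                         # offset in words
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("target is beyond +/- 32 Mbytes")
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | (words & 0xFFFFFF)


def branch_target(instr_addr, word):
    """Recover the destination address from a B/BL instruction word."""
    field = word & 0xFFFFFF
    if field & 0x800000:                        # sign extend the 24-bit field
        field -= 1 << 24
    return (instr_addr + 8 + (field << 2)) & 0xFFFFFFFF
```

A branch to itself needs an offset of -8 bytes, or -2 words, so `encode_branch(0x1000, 0x1000)` gives 0xeafffffe, matching the first example above.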
4.4 data processing

the data processing instruction is only executed if the condition is true. the conditions are defined in table 4-2: condition code summary on page 4-6. the instruction encoding is shown in figure 4-3: data processing instructions below.

figure 4-3: data processing instructions

bits     field       meaning
31:28    cond        condition field
27:26    00
25       i           immediate operand: 0 = operand 2 is a register, 1 = operand 2 is an immediate value
24:21    opcode      operation code
20       s           set condition codes: 0 = do not alter condition codes, 1 = set condition codes
19:16    rn          1st operand register
15:12    rd          destination register
11:0     operand 2

operand 2 when i = 0:
11:4     shift       shift applied to rm
3:0      rm          2nd operand register

operand 2 when i = 1:
11:8     rotate      shift applied to imm
7:0      imm         unsigned 8-bit immediate value

operation code:
0000 = and - rd:= op1 and op2
0001 = eor - rd:= op1 eor op2
0010 = sub - rd:= op1 - op2
0011 = rsb - rd:= op2 - op1
0100 = add - rd:= op1 + op2
0101 = adc - rd:= op1 + op2 + c
0110 = sbc - rd:= op1 - op2 + c - 1
0111 = rsc - rd:= op2 - op1 + c - 1
1000 = tst - set condition codes on op1 and op2
1001 = teq - set condition codes on op1 eor op2
1010 = cmp - set condition codes on op1 - op2
1011 = cmn - set condition codes on op1 + op2
1100 = orr - rd:= op1 or op2
1101 = mov - rd:= op2
1110 = bic - rd:= op1 and not op2
1111 = mvn - rd:= not op2

the instruction produces a result by performing a specified arithmetic or logical operation on one or two operands. the first operand is always a register (rn). the second operand may be a shifted register (rm) or a rotated 8-bit immediate value (imm) according to the value of the i bit in the instruction. the condition codes in the cpsr may be preserved or updated as a result of this instruction, according to the value of the s bit in the instruction.

certain operations (tst, teq, cmp, cmn) do not write the result to rd. they are used only to perform tests and to set the condition codes on the result and always have the s bit set. the instructions and their effects are listed in table 4-3: arm data processing instructions on page 4-10.

4.4.1 cpsr flags

the data processing operations may be classified as logical or arithmetic.

logical operations

the logical operations (and, eor, tst, teq, orr, mov, bic, mvn) perform the logical action on all corresponding bits of the operand or operands to produce the result. if the s bit is set (and rd is not r15, see below) the v flag in the cpsr will be unaffected, the c flag will be set to the carry out from the barrel shifter (or preserved when the shift operation is lsl #0), the z flag will be set if and only if the result is all zeros, and the n flag will be set to the logical value of bit 31 of the result.
assembler mnemonic   opcode   action
and                  0000     operand1 and operand2
eor                  0001     operand1 eor operand2
sub                  0010     operand1 - operand2
rsb                  0011     operand2 - operand1
add                  0100     operand1 + operand2
adc                  0101     operand1 + operand2 + carry
sbc                  0110     operand1 - operand2 + carry - 1
rsc                  0111     operand2 - operand1 + carry - 1
tst                  1000     as and, but result is not written
teq                  1001     as eor, but result is not written
cmp                  1010     as sub, but result is not written
cmn                  1011     as add, but result is not written
orr                  1100     operand1 or operand2
mov                  1101     operand2 (operand1 is ignored)
bic                  1110     operand1 and not operand2 (bit clear)
mvn                  1111     not operand2 (operand1 is ignored)

table 4-3: arm data processing instructions

arithmetic operations

the arithmetic operations (sub, rsb, add, adc, sbc, rsc, cmp, cmn) treat each operand as a 32-bit integer (either unsigned or two's complement signed, the two are equivalent). if the s bit is set (and rd is not r15) the v flag in the cpsr will be set if an overflow occurs into bit 31 of the result; this may be ignored if the operands were considered unsigned, but warns of a possible error if the operands were two's complement signed. the c flag will be set to the carry out of bit 31 of the alu, the z flag will be set if and only if the result was zero, and the n flag will be set to the value of bit 31 of the result (indicating a negative result if the operands are considered to be two's complement signed).

4.4.2 shifts

when the second operand is specified to be a shifted register, the operation of the barrel shifter is controlled by the shift field in the instruction. this field indicates the type of shift to be performed (logical left or right, arithmetic right or rotate right). the amount by which the register should be shifted may be contained in an immediate field in the instruction, or in the bottom byte of another register (other than r15). the encoding for the different shift types is shown in figure 4-4: arm shift operations.

figure 4-4: arm shift operations

shift field (bits 11:4) with the shift amount in the instruction:
11:7     shift amount   5-bit unsigned integer
6:5      shift type     00 = logical left, 01 = logical right, 10 = arithmetic right, 11 = rotate right
4        0

shift field (bits 11:4) with the shift amount in a register:
11:8     rs             shift amount specified in bottom byte of rs
7        0
6:5      shift type     00 = logical left, 01 = logical right, 10 = arithmetic right, 11 = rotate right
4        1
instruction specified shift amount

when the shift amount is specified in the instruction, it is contained in a 5-bit field which may take any value from 0 to 31. a logical shift left (lsl) takes the contents of rm and moves each bit by the specified amount to a more significant position. the least significant bits of the result are filled with zeros, and the high bits of rm which do not map into the result are discarded, except that the least significant discarded bit becomes the shifter carry output which may be latched into the c bit of the cpsr when the alu operation is in the logical class (see above). for example, the effect of lsl #5 is shown in figure 4-5: logical shift left.

figure 4-5: logical shift left (lsl #5: bits 26:0 of rm become bits 31:5 of operand 2, bits 4:0 of operand 2 are zero, and bit 27 of rm is the carry out)

note lsl #0 is a special case, where the shifter carry out is the old value of the cpsr c flag. the contents of rm are used directly as the second operand.

a logical shift right (lsr) is similar, but the contents of rm are moved to less significant positions in the result. lsr #5 has the effect shown in figure 4-6: logical shift right.

figure 4-6: logical shift right (lsr #5: bits 31:5 of rm become bits 26:0 of operand 2, bits 31:27 of operand 2 are zero, and bit 4 of rm is the carry out)
the form of the shift field which might be expected to correspond to lsr #0 is used to encode lsr #32, which has a zero result with bit 31 of rm as the carry output. logical shift right zero is redundant as it is the same as logical shift left zero, so the assembler will convert lsr #0 (and asr #0 and ror #0) into lsl #0, and allow lsr #32 to be specified.

an arithmetic shift right (asr) is similar to logical shift right, except that the high bits are filled with bit 31 of rm instead of zeros. this preserves the sign in two's complement notation. for example, asr #5 is shown in figure 4-7: arithmetic shift right.

figure 4-7: arithmetic shift right (asr #5: bits 31:5 of rm become bits 26:0 of operand 2, bits 31:27 of operand 2 are copies of bit 31 of rm, and bit 4 of rm is the carry out)

the form of the shift field which might be expected to give asr #0 is used to encode asr #32. bit 31 of rm is again used as the carry output, and each bit of operand 2 is also equal to bit 31 of rm. the result is therefore all ones or all zeros, according to the value of bit 31 of rm.

rotate right (ror) operations reuse the bits which overshoot in a logical shift right operation by reintroducing them at the high end of the result, in place of the zeros used to fill the high end in logical right operations. for example, ror #5 is shown in figure 4-8: rotate right on page 4-13.

figure 4-8: rotate right (ror #5: bits 31:5 of rm become bits 26:0 of operand 2, bits 4:0 of rm become bits 31:27 of operand 2, and bit 4 of rm is the carry out)
the form of the shift field which might be expected to give ror #0 is used to encode a special function of the barrel shifter, rotate right extended (rrx). this is a rotate right by one bit position of the 33-bit quantity formed by appending the cpsr c flag to the most significant end of the contents of rm as shown in figure 4-9: rotate right extended.

figure 4-9: rotate right extended (rrx: the c flag becomes bit 31 of operand 2, bits 31:1 of rm become bits 30:0 of operand 2, and bit 0 of rm is the carry out)

register specified shift amount

only the least significant byte of the contents of rs is used to determine the shift amount. rs can be any general register other than r15.

if this byte is zero, the unchanged contents of rm will be used as the second operand, and the old value of the cpsr c flag will be passed on as the shifter carry output.

if the byte has a value between 1 and 31, the shifted result will exactly match that of an instruction specified shift with the same value and shift operation.

if the value in the byte is 32 or more, the result will be a logical extension of the shift described above:

1 lsl by 32 has result zero, carry out equal to bit 0 of rm.
2 lsl by more than 32 has result zero, carry out zero.
3 lsr by 32 has result zero, carry out equal to bit 31 of rm.
4 lsr by more than 32 has result zero, carry out zero.
5 asr by 32 or more has result filled with and carry out equal to bit 31 of rm.
6 ror by 32 has result equal to rm, carry out equal to bit 31 of rm.
7 ror by n where n is greater than 32 will give the same result and carry out as ror by n-32; therefore repeatedly subtract 32 from n until the amount is in the range 1 to 32 and see above.

note the zero in bit 7 of an instruction with a register controlled shift is compulsory; a one in this bit will cause the instruction to be a multiply or undefined instruction.
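The register specified shift rules, including the cases for amounts of 32 or more, can be collected into one function. The Python sketch below is a hypothetical model written from the rules above, not a description of the hardware; the name `shift_by_register` and the string shift names are illustrative. It returns the value of operand 2 and the shifter carry output.

```python
MASK = 0xFFFFFFFF


def shift_by_register(kind, rm, rs, carry_in):
    """Return (operand2, carry_out) for a register specified shift.

    kind is one of "lsl", "lsr", "asr", "ror"; only the bottom byte of
    rs is used as the shift amount.
    """
    rm &= MASK
    n = rs & 0xFF
    if n == 0:                      # Rm unchanged, old C flag passed on
        return rm, carry_in
    if kind == "lsl":
        if n < 32:
            return (rm << n) & MASK, (rm >> (32 - n)) & 1
        return 0, (rm & 1) if n == 32 else 0
    if kind == "lsr":
        if n < 32:
            return rm >> n, (rm >> (n - 1)) & 1
        return 0, (rm >> 31) if n == 32 else 0
    if kind == "asr":
        sign = rm >> 31
        if n >= 32:                 # result filled with bit 31 of Rm
            return (MASK if sign else 0), sign
        signed = rm - (1 << 32) if sign else rm
        return (signed >> n) & MASK, (rm >> (n - 1)) & 1
    if kind == "ror":
        n = n % 32 or 32            # reduce the amount to the range 1..32
        if n == 32:
            return rm, rm >> 31
        return ((rm >> n) | (rm << (32 - n))) & MASK, (rm >> (n - 1)) & 1
    raise ValueError("unknown shift type: " + kind)
```

For amounts from 1 to 31 the same function also models the instruction specified forms, since the text states that the results match; lsr #32, asr #32 and rrx are separate encodings and are not reachable through the immediate field in this sketch.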
4.4.3 immediate operand rotates

the immediate operand rotate field is a 4-bit unsigned integer which specifies a shift operation on the 8-bit immediate value. this value is zero extended to 32 bits, and then subject to a rotate right by twice the value in the rotate field. this enables many common constants to be generated, for example all powers of 2.

4.4.4 writing to r15

when rd is a register other than r15, the condition code flags in the cpsr may be updated from the alu flags as described above.

when rd is r15 and the s flag in the instruction is not set the result of the operation is placed in r15 and the cpsr is unaffected.

when rd is r15 and the s flag is set the result of the operation is placed in r15 and the spsr corresponding to the current mode is moved to the cpsr. this allows state changes which atomically restore both pc and cpsr. this form of instruction should not be used in user mode.

4.4.5 using r15 as an operand

if r15 (the pc) is used as an operand in a data processing instruction the register is used directly.

the pc value will be the address of the instruction, plus 8 or 12 bytes due to instruction prefetching. if the shift amount is specified in the instruction, the pc will be 8 bytes ahead. if a register is used to specify the shift amount the pc will be 12 bytes ahead.

4.4.6 teq, tst, cmp and cmn opcodes

note teq, tst, cmp and cmn do not write the result of their operation but do set flags in the cpsr. an assembler should always set the s flag for these instructions even if this is not specified in the mnemonic.

the teqp form of the teq instruction used in earlier arm processors must not be used: the psr transfer operations should be used instead. the action of teqp in the arm610 is to move spsr_<mode> to the cpsr if the processor is in a privileged mode and to do nothing if in user mode.

4.4.7 assembler syntax

1 mov,mvn (single operand instructions.)
  <opcode>{cond}{s} rd,<op2>
2 cmp,cmn,teq,tst (instructions which do not produce a result.)
  <opcode>{cond} rn,<op2>
3 and,eor,sub,rsb,add,adc,sbc,rsc,orr,bic
  <opcode>{cond}{s} rd,rn,<op2>

where:

<op2>           is rm{,<shift>} or,<#expression>
{cond}          is a two-character condition mnemonic. see table 4-2: condition code summary on page 4-6.
{s}             set condition codes if s present (implied for cmp, cmn, teq, tst).
rd, rn and rm   are expressions evaluating to a register number.
<#expression>   if this is used, the assembler will attempt to generate a shifted immediate 8-bit field to match the expression. if this is impossible, it will give an error.
<shift>         is <shiftname> <register> or <shiftname> #expression, or rrx (rotate right one bit with extend).
<shiftname>s    are: asl, lsl, lsr, asr, ror. (asl is a synonym for lsl, they assemble to the same code.)

4.4.8 examples

        addeq r2,r4,r5        ; if the z flag is set make r2:=r4+r5
        teqs  r4,#3           ; test r4 for equality with 3.
                              ; (the s is in fact redundant as the
                              ; assembler inserts it automatically.)
        sub   r4,r5,r7,lsr r2 ; logical right shift r7 by the number in
                              ; the bottom byte of r2, subtract result
                              ; from r5, and put the answer into r4.
        mov   pc,r14          ; return from subroutine.
        movs  pc,r14          ; return from exception and restore cpsr
                              ; from spsr_mode.
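Section 4.4.3 says an immediate is an 8-bit value rotated right by twice a 4-bit rotate field, and section 4.4.7 says the assembler reports an error when no such pair matches `<#expression>`. The Python sketch below shows one way such a search could work; `ror32` and `encode_immediate` are hypothetical names, and returning the smallest rotate that works is an assumption of this sketch rather than documented assembler behaviour.

```python
MASK = 0xFFFFFFFF


def ror32(value, n):
    """Rotate a 32-bit value right by n bit positions."""
    n %= 32
    return ((value >> n) | (value << (32 - n))) & MASK


def encode_immediate(value):
    """Return (rotate, imm8) with value == ror32(imm8, 2 * rotate), or None."""
    value &= MASK
    for rotate in range(16):
        # Undo the rotate right by rotating left the same distance.
        imm8 = ror32(value, 32 - 2 * rotate)
        if imm8 < 0x100:
            return rotate, imm8
    return None   # an assembler would report an error here
```

For example, 0xf0000000 is encodable as rotate 2 with immediate 0x0f, while 0x101 is not encodable because its set bits do not fit in any 8-bit window at an even rotation.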
4.5 psr transfer (mrs, msr)

the instruction is only executed if the condition is true. the various conditions are defined in table 4-2: condition code summary on page 4-6.

the mrs and msr instructions are formed from a subset of the data processing operations and are implemented using the teq, tst, cmn and cmp instructions without the s flag set. the encoding is shown in figure 4-10: psr transfer on page 4-18.

these instructions allow access to the cpsr and spsr registers. the mrs instruction allows the contents of the cpsr or spsr_<mode> to be moved to a general register. the msr instruction allows the contents of a general register to be moved to the cpsr or spsr_<mode> register.

the msr instruction also allows an immediate value or register contents to be transferred to the condition code flags (n,z,c and v) of cpsr or spsr_<mode> without affecting the control bits. in this case, the top four bits of the specified register contents or 32-bit immediate value are written to the top four bits of the relevant psr.

4.5.1 operand restrictions

in user mode, the control bits of the cpsr are protected from change, so only the condition code flags of the cpsr can be changed. in other (privileged) modes the entire cpsr can be changed.

note that the software must never change the state of the t bit in the cpsr. if this happens, the processor will enter an unpredictable state.

the spsr register which is accessed depends on the mode at the time of execution. for example, only spsr_fiq is accessible when the processor is in fiq mode.

you must not specify r15 as the source or destination register.

also, do not attempt to access an spsr in user mode, since no such register exists.
figure 4-10: psr transfer

mrs (transfer psr contents to a register)
31:28    cond            condition field
27:23    00010
22       ps              source psr: 0 = cpsr, 1 = spsr_<mode>
21:16    001111
15:12    rd              destination register
11:0     000000000000

msr (transfer register contents to psr)
31:28    cond            condition field
27:23    00010
22       pd              destination psr: 0 = cpsr, 1 = spsr_<mode>
21:12    1010011111
11:4     00000000
3:0      rm              source register

msr (transfer register contents or immediate value to psr flag bits only)
31:28    cond            condition field
27:26    00
25       i               immediate operand: 0 = source operand is a register, 1 = source operand is an immediate value
24:23    10
22       pd              destination psr: 0 = cpsr, 1 = spsr_<mode>
21:12    1010001111
11:0     source operand

source operand when i = 0:
11:4     00000000
3:0      rm              source register

source operand when i = 1:
11:8     rotate          shift applied to imm
7:0      imm             unsigned 8-bit immediate value
4.5.2 reserved bits

only twelve bits of the psr are defined in arm610 (n,z,c,v,i,f, t & m[4:0]); the remaining bits are reserved for use in future versions of the processor. refer to figure 3-6: program status register format on page 3-8 for a full description of the psr bits.

to ensure the maximum compatibility between arm610 programs and future processors, the following rules should be observed:

the reserved bits should be preserved when changing the value in a psr.

programs should not rely on specific values from the reserved bits when checking the psr status, since they may read as one or zero in future processors.

a read-modify-write strategy should therefore be used when altering the control bits of any psr register; this involves transferring the appropriate psr register to a general register using the mrs instruction, changing only the relevant bits and then transferring the modified value back to the psr register using the msr instruction.

example

the following sequence performs a mode change:

        mrs  r0,cpsr              ; take a copy of the cpsr.
        bic  r0,r0,#0x1f          ; clear the mode bits.
        orr  r0,r0,#new_mode      ; select new mode
        msr  cpsr,r0              ; write back the modified
                                  ; cpsr.

when the aim is simply to change the condition code flags in a psr, a value can be written directly to the flag bits without disturbing the control bits. the following instruction sets the n,z,c and v flags:

        msr  cpsr_flg,#0xf0000000 ; set all the flags
                                  ; regardless of their
                                  ; previous state (does not
                                  ; affect any control bits).

no attempt should be made to write an 8-bit immediate value into the whole psr since such an operation cannot preserve the reserved bits.
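The read-modify-write rule can be checked on plain integers. The Python sketch below mirrors the two assembly sequences above under the assumption that a psr value is held as a 32-bit integer; `change_mode` and `set_flags` are hypothetical helper names, not instructions or library calls.

```python
MODE_MASK = 0x0000001F    # M[4:0]
FLAG_MASK = 0xF0000000    # N, Z, C and V


def change_mode(cpsr, new_mode):
    """Mirror of mrs / bic #0x1f / orr #new_mode / msr: only M[4:0] changes."""
    return (cpsr & ~MODE_MASK & 0xFFFFFFFF) | (new_mode & MODE_MASK)


def set_flags(cpsr, value):
    """Mirror of msr cpsr_flg: only the top four bits are replaced."""
    return (cpsr & ~FLAG_MASK & 0xFFFFFFFF) | (value & FLAG_MASK)
```

Every bit outside the masked field passes through unchanged, which is what keeps the reserved bits intact: `change_mode(0x600000d3, 0b10010)` gives 0x600000d2, and `set_flags(0x000000d3, 0xf0000000)` gives 0xf00000d3.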
4.5.3 assembler syntax

1 mrs - transfer psr contents to a register
  mrs{cond} rd,<psr>
2 msr - transfer register contents to psr
  msr{cond} <psr>,rm
3 msr - transfer register contents to psr flag bits only
  msr{cond} <psrf>,rm
  the most significant four bits of the register contents are written to the n,z,c & v flags respectively.
4 msr - transfer immediate value to psr flag bits only
  msr{cond} <psrf>,<#expression>
  the expression should symbolise a 32-bit value of which the most significant four bits are written to the n,z,c and v flags respectively.

key

{cond}          two-character condition mnemonic. see table 4-2: condition code summary on page 4-6.
rd and rm       are expressions evaluating to a register number other than r15
<psr>           is cpsr, cpsr_all, spsr or spsr_all. (cpsr and cpsr_all are synonyms as are spsr and spsr_all)
<psrf>          is cpsr_flg or spsr_flg
<#expression>   where this is used, the assembler will attempt to generate a shifted immediate 8-bit field to match the expression. if this is impossible, it will give an error.
4.5.4 examples

in user mode the instructions behave as follows:

    msr cpsr_all,rm          ; cpsr[31:28] <- rm[31:28]
    msr cpsr_flg,rm          ; cpsr[31:28] <- rm[31:28]
    msr cpsr_flg,#0xa0000000 ; cpsr[31:28] <- 0xa
                             ; (set n,c; clear z,v)
    mrs rd,cpsr              ; rd[31:0] <- cpsr[31:0]

in privileged modes the instructions behave as follows:

    msr cpsr_all,rm          ; cpsr[31:0] <- rm[31:0]
    msr cpsr_flg,rm          ; cpsr[31:28] <- rm[31:28]
    msr cpsr_flg,#0x50000000 ; cpsr[31:28] <- 0x5
                             ; (set z,v; clear n,c)
    mrs rd,cpsr              ; rd[31:0] <- cpsr[31:0]
    msr spsr_all,rm          ; spsr_<mode>[31:0] <- rm[31:0]
    msr spsr_flg,rm          ; spsr_<mode>[31:28] <- rm[31:28]
    msr spsr_flg,#0xc0000000 ; spsr_<mode>[31:28] <- 0xc
                             ; (set n,z; clear c,v)
    mrs rd,spsr              ; rd[31:0] <- spsr_<mode>[31:0]
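the difference between the user mode and privileged mode behaviour of msr on the cpsr can be expressed as a choice of write mask. the python sketch below is illustrative only; the function name msr_cpsr and its parameters are hypothetical and simply restate the examples listed above.

```python
FLAG_MASK = 0xF0000000   # n, z, c, v in bits 31:28
ALL_MASK = 0xFFFFFFFF    # the whole 32-bit psr


def msr_cpsr(cpsr: int, value: int, field: str, privileged: bool) -> int:
    """Model msr cpsr_all / msr cpsr_flg; field is 'all' or 'flg'."""
    if field not in ("all", "flg"):
        raise ValueError("field must be 'all' or 'flg'")
    # only a privileged cpsr_all write reaches bits 27:0
    mask = ALL_MASK if (field == "all" and privileged) else FLAG_MASK
    return (cpsr & ~mask & ALL_MASK) | (value & mask)


if __name__ == "__main__":
    # hypothetical starting cpsr and source register values
    print(hex(msr_cpsr(0x000000D0, 0xA000001F, "all", privileged=False)))
    print(hex(msr_cpsr(0x000000D0, 0xA000001F, "all", privileged=True)))
```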
4.6 multiply and multiply-accumulate (mul, mla)

the instruction is only executed if the condition is true. the various conditions are defined in table 4-2: condition code summary on page 4-6. the instruction encoding is shown in figure 4-11: multiply instructions. the multiply and multiply-accumulate instructions use an 8-bit booth's algorithm to perform integer multiplication.

figure 4-11: multiply instructions

    bits    field    meaning
    31:28   cond     condition field
    27:22   000000
    21      a        accumulate: 0 = multiply only, 1 = multiply and accumulate
    20      s        set condition code: 0 = do not alter condition codes, 1 = set condition codes
    19:16   rd       destination register
    15:12   rn       operand register
    11:8    rs       operand register
    7:4     1001
    3:0     rm       operand register

the multiply form of the instruction gives rd:=rm*rs. rn is ignored, and should be set to zero for compatibility with possible future upgrades to the instruction set. the multiply-accumulate form gives rd:=rm*rs+rn, which can save an explicit add instruction in some circumstances.

both forms of the instruction work on operands which may be considered as signed (two's complement) or unsigned integers. the results of a signed multiply and of an unsigned multiply of 32-bit operands differ only in the upper 32 bits - the low 32 bits of the signed and unsigned results are identical. as these instructions only produce the low 32 bits of a multiply, they can be used for both signed and unsigned multiplies.

for example consider the multiplication of the operands:

    operand a     operand b     result
    0xfffffff6    0x00000014    0xffffff38

if the operands are interpreted as signed, operand a has the value -10, operand b has the value 20, and the result is -200, which is correctly represented as 0xffffff38.

if the operands are interpreted as unsigned, operand a has the value 4294967286, operand b has the value 20 and the result is 85899345720, which is represented as 0x13ffffff38, so the least significant 32 bits are 0xffffff38.
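the worked example above can be checked mechanically. the following python sketch models only the arithmetic result of mul and mla (the low 32 bits); the helper names mul, mla and to_signed are hypothetical, and nothing here models timing or flags.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a two's complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def mul(rm: int, rs: int) -> int:
    """rd := rm * rs, keeping only the low 32 bits."""
    return (rm * rs) & MASK32


def mla(rm: int, rs: int, rn: int) -> int:
    """rd := rm * rs + rn, keeping only the low 32 bits."""
    return (rm * rs + rn) & MASK32


if __name__ == "__main__":
    a, b = 0xFFFFFFF6, 0x00000014
    print(hex(a * b))                 # full unsigned product
    print(hex(mul(a, b)))             # low 32 bits, as produced by mul
    print(to_signed(mul(a, b)))       # the same bits read as a signed value
```

the same 32-bit pattern is obtained whichever interpretation is chosen for the operands, which is why a single instruction serves both cases.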
4.6.1 operand restrictions

the destination register rd must not be the same as the operand register rm. r15 must not be used as an operand or as the destination register. all other register combinations will give correct results, and rd, rn and rs may use the same register when required.

4.6.2 cpsr flags

setting the cpsr flags is optional, and is controlled by the s bit in the instruction. the n (negative) and z (zero) flags are set correctly on the result (n is made equal to bit 31 of the result, and z is set if and only if the result is zero). the c (carry) flag is set to a meaningless value and the v (overflow) flag is unaffected.

4.6.3 assembler syntax

    mul{cond}{s} rd,rm,rs
    mla{cond}{s} rd,rm,rs,rn

{cond}              two-character condition mnemonic. see table 4-2: condition code summary on page 4-6.
{s}                 set condition codes if s present
rd, rm, rs and rn   are expressions evaluating to a register number other than r15.

4.6.4 examples

    mul r1,r2,r3        ; r1:=r2*r3
    mlaeqs r1,r2,r3,r4  ; conditionally r1:=r2*r3+r4,
                        ; setting condition codes.
4.7 single data transfer (ldr, str)

the instruction is only executed if the condition is true. the various conditions are defined in table 4-2: condition code summary on page 4-6. the instruction encoding is shown in figure 4-12: single data transfer instructions on page 4-24.

the single data transfer instructions are used to load or store single bytes or words of data. the memory address used in the transfer is calculated by adding an offset to or subtracting an offset from a base register. the result of this calculation may be written back into the base register if auto-indexing is required.

figure 4-12: single data transfer instructions

    bits    field    meaning
    31:28   cond     condition field
    27:26   01
    25      i        immediate offset: 0 = offset is an immediate value, 1 = offset is a register
    24      p        pre/post indexing bit: 0 = post; add offset after transfer, 1 = pre; add offset before transfer
    23      u        up/down bit: 0 = down; subtract offset from base, 1 = up; add offset to base
    22      b        byte/word bit: 0 = transfer word quantity, 1 = transfer byte quantity
    21      w        write-back bit: 0 = no write-back, 1 = write address into base
    20      l        load/store bit: 0 = store to memory, 1 = load from memory
    19:16   rn       base register
    15:12   rd       source/destination register
    11:0    offset   see below

    offset field when i = 0:
    11:0    unsigned 12-bit immediate offset

    offset field when i = 1:
    11:4    shift applied to rm
    3:0     rm, offset register
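as an illustration of the field layout in figure 4-12, the python sketch below splits a 32-bit instruction word into its fields. the function name decode_single_transfer is hypothetical, and the two instruction words in the usage part are encodings worked out by hand from the figure for ldr r1,[r2,#16] and ldr r1,[r2,r3,lsl#2] with the "always" condition; they are assumptions, not values quoted from this data sheet.

```python
def decode_single_transfer(word: int) -> dict:
    """Split an ldr/str instruction word into the fields of figure 4-12."""
    if ((word >> 26) & 0b11) != 0b01:
        raise ValueError("bits 27:26 are not 01: not a single data transfer")
    fields = {
        "cond": (word >> 28) & 0xF,
        "i": (word >> 25) & 1,   # 0 = immediate offset, 1 = register offset
        "p": (word >> 24) & 1,   # 1 = pre-indexed
        "u": (word >> 23) & 1,   # 1 = add offset
        "b": (word >> 22) & 1,   # 1 = byte transfer
        "w": (word >> 21) & 1,   # 1 = write-back
        "l": (word >> 20) & 1,   # 1 = load
        "rn": (word >> 16) & 0xF,
        "rd": (word >> 12) & 0xF,
    }
    if fields["i"]:
        fields["shift"] = (word >> 4) & 0xFF   # 8 shift control bits
        fields["rm"] = word & 0xF
    else:
        fields["immediate"] = word & 0xFFF     # unsigned 12-bit offset
    return fields


if __name__ == "__main__":
    print(decode_single_transfer(0xE5921010))  # assumed: ldr r1,[r2,#16]
    print(decode_single_transfer(0xE7921103))  # assumed: ldr r1,[r2,r3,lsl#2]
```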
4.7.1 offsets and auto-indexing

the offset from the base may be either a 12-bit unsigned binary immediate value in the instruction, or a second register (possibly shifted in some way). the offset may be added to (u=1) or subtracted from (u=0) the base register rn. the offset modification may be performed either before (pre-indexed, p=1) or after (post-indexed, p=0) the base is used as the transfer address.

the w bit gives optional auto increment and decrement addressing modes. the modified base value may be written back into the base (w=1), or the old base value may be kept (w=0). in the case of post-indexed addressing, the write back bit is redundant and is always set to zero, since the old base value can be retained by setting the offset to zero. therefore post-indexed data transfers always write back the modified base. the only use of the w bit in a post-indexed data transfer is in privileged mode code, where setting the w bit forces non-privileged mode for the transfer, allowing the operating system to generate a user address in a system where the memory management hardware makes suitable use of this hardware.

4.7.2 shifted register offset

the 8 shift control bits are described in the data processing instructions section. however, the register specified shift amounts are not available in this instruction class. see 4.4.2 shifts on page 4-11.

4.7.3 bytes and words

this instruction class may be used to transfer a byte (b=1) or a word (b=0) between an arm610 register and memory. the action of ldr(b) and str(b) instructions is influenced by the bigend control signal. the two possible configurations are described below.

little-endian configuration

a byte load (ldrb) expects the data on data bus inputs 7 through 0 if the supplied address is on a word boundary, on data bus inputs 15 through 8 if it is a word address plus one byte, and so on. the selected byte is placed in the bottom 8 bits of the destination register, and the remaining bits of the register are filled with zeros. please see figure 3-2: little endian addresses of bytes within words on page 3-3.

a byte store (strb) repeats the bottom 8 bits of the source register four times across data bus outputs 31 through 0. the external memory system should activate the appropriate byte subsystem to store the data.

a word load (ldr) will normally use a word aligned address. however, an address offset from a word boundary will cause the data to be rotated into the register so that the addressed byte occupies bits 0 to 7. this means that halfwords accessed at offsets 0 and 2 from the word boundary will be correctly loaded into bits 0 through 15 of the register. two shift operations are then required to clear or to sign extend the upper 16 bits. this is illustrated in figure 4-13: little-endian offset addressing on page 4-26.
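the little-endian byte selection and word rotation described above can be sketched in python as follows. the function names are hypothetical, bus_word stands for the 32-bit value on the data bus, and the example word 0xaabbccdd is an arbitrary assumption chosen so that each byte is recognisable.

```python
def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def ldrb_little_endian(bus_word: int, address: int) -> int:
    """Byte load: pick the lane given by address bits 1:0, zero-fill the rest."""
    lane = address & 3                      # 0 -> bits 7:0, 1 -> bits 15:8, ...
    return (bus_word >> (8 * lane)) & 0xFF


def ldr_little_endian(bus_word: int, address: int) -> int:
    """Word load: rotate so the addressed byte lands in bits 7:0."""
    return ror32(bus_word, 8 * (address & 3))


if __name__ == "__main__":
    word = 0xAABBCCDD                       # hypothetical memory word
    print(hex(ldrb_little_endian(word, 0x1001)))
    print(hex(ldr_little_endian(word, 0x1002)))
    # halfword at offset 2 is now in bits 15:0; clear the top half with two
    # shifts, as the text describes (lsl #16 then lsr #16)
    rotated = ldr_little_endian(word, 0x1002)
    print(hex(((rotated << 16) & 0xFFFFFFFF) >> 16))
```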
figure 4-13: little-endian offset addressing
(diagram: memory bytes a, b, c, d at addresses a+3 down to a, and the resulting register contents for an ldr from a word aligned address and for an ldr from an address offset by 2)

a word store (str) should generate a word aligned address. the word presented to the data bus is not affected if the address is not word aligned. that is, bit 31 of the register being stored always appears on data bus output 31.

big-endian configuration

a byte load (ldrb) expects the data on data bus inputs 31 through 24 if the supplied address is on a word boundary, on data bus inputs 23 through 16 if it is a word address plus one byte, and so on. the selected byte is placed in the bottom 8 bits of the destination register and the remaining bits of the register are filled with zeros. please see figure 3-1: big endian addresses of bytes within words on page 3-3.

a byte store (strb) repeats the bottom 8 bits of the source register four times across data bus outputs 31 through 0. the external memory system should activate the appropriate byte subsystem to store the data.

a word load (ldr) should generate a word aligned address. an address offset of 0 or 2 from a word boundary will cause the data to be rotated into the register so that the addressed byte occupies bits 31 through 24. this means that halfwords accessed at these offsets will be correctly loaded into bits 16 through 31 of the register. a shift operation is then required to move (and optionally sign extend) the data into the bottom 16 bits. an address offset of 1 or 3 from a word boundary will cause the data to be rotated into the register so that the addressed byte occupies bits 15 through 8.
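a python sketch of the big-endian case follows. it assumes, as an interpretation rather than something stated explicitly above, that the word load rotates the bus value right by 8 bits per byte of address offset; that single rule reproduces both placements given in the text (bits 31 through 24 for offsets 0 and 2, bits 15 through 8 for offsets 1 and 3). the function names and the example word are hypothetical.

```python
def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def ldrb_big_endian(bus_word: int, address: int) -> int:
    """Byte load: offset 0 uses bus bits 31:24, offset 1 bits 23:16, ..."""
    lane = address & 3
    return (bus_word >> (24 - 8 * lane)) & 0xFF


def ldr_big_endian(bus_word: int, address: int) -> int:
    """Word load with the assumed rotation of 8 bits per byte of offset."""
    return ror32(bus_word, 8 * (address & 3))


if __name__ == "__main__":
    word = 0xAABBCCDD                    # hypothetical bus value
    for offset in range(4):
        loaded = ldr_big_endian(word, offset)
        print(offset, hex(ldrb_big_endian(word, offset)), hex(loaded))
```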
a word store (str) should generate a word aligned address. the word presented to the data bus is not affected if the address is not word aligned. that is, bit 31 of the register being stored always appears on data bus output 31.

4.7.4 use of r15

write-back must not be specified if r15 is specified as the base register (rn). when using r15 as the base register you must remember it contains an address 8 bytes on from the address of the current instruction. r15 must not be specified as the register offset (rm). when r15 is the source register (rd) of a register store (str) instruction, the stored value will be the address of the instruction plus 12.

4.7.5 restriction on the use of base register

when configured for late aborts, the following example code is difficult to unwind after an abort, as the base register, rn, gets updated before the abort handler starts. sometimes it may be impossible to calculate the initial value.

example:

    ldr r0,[r1],r1

therefore a post-indexed ldr or str where rm is the same register as rn should not be used.

4.7.6 data aborts

a transfer to or from a legal address may cause problems for a memory management system. for instance, in a system which uses virtual memory the required data may be absent from main memory. the memory manager can signal a problem by taking the processor abort input high whereupon the data abort trap will be taken. it is up to the system software to resolve the cause of the problem, then the instruction can be restarted and the original program continued.

4.7.7 instruction cycle times

normal ldr instructions take 1s + 1n + 1i and ldr pc take 2s + 2n + 1i incremental cycles, where s, n and i are as defined in 6.2 cycle types on page 6-2.
str instructions take 2n incremental cycles to execute.
4.7.8 assembler syntax

    <ldr|str>{cond}{b}{t} rd,<address>

where:
ldr         load from memory into a register
str         store from a register into memory
{cond}      two-character condition mnemonic. see table 4-2: condition code summary on page 4-6.
{b}         if b is present then byte transfer, otherwise word transfer
{t}         if t is present the w bit will be set in a post-indexed instruction, forcing non-privileged mode for the transfer cycle. t is not allowed when a pre-indexed addressing mode is specified or implied.
rd          is an expression evaluating to a valid register number.
rn and rm   are expressions evaluating to a register number. if rn is r15 then the assembler will subtract 8 from the offset value to allow for arm610 pipelining. in this case base write-back should not be specified.

<address> can be:

1 an expression which generates an address:
  the assembler will attempt to generate an instruction using the pc as a base and a corrected immediate offset to address the location given by evaluating the expression. this will be a pc relative, pre-indexed address. if the address is out of range, an error will be generated.

2 a pre-indexed addressing specification:
    [rn]                        offset of zero
    [rn,<#expression>]{!}       offset of <expression> bytes
    [rn,{+/-}rm{,<shift>}]{!}   offset of +/- contents of index register, shifted by <shift>

3 a post-indexed addressing specification:
    [rn],<#expression>          offset of <expression> bytes
    [rn],{+/-}rm{,<shift>}      offset of +/- contents of index register, shifted as by <shift>

<shift>     general shift operation (see data processing instructions) but you cannot specify the shift amount by a register.
{!}         writes back the base register (set the w bit) if ! is present.

4.7.9 examples

    str r1,[r2,r4]!        ; store r1 at r2+r4 (both of which are
                           ; registers) and write back address to r2.
    str r1,[r2],r4         ; store r1 at r2 and write back
                           ; r2+r4 to r2.
    ldr r1,[r2,#16]        ; load r1 from contents of r2+16, but
                           ; don't write back.
    ldr r1,[r2,r3,lsl#2]   ; load r1 from contents of r2+r3*4.
    ldreqb r1,[r6,#5]      ; conditionally load byte at r6+5 into
                           ; r1 bits 0 to 7, filling bits 8 to 31
                           ; with zeros.
    str r1,place           ; generate pc relative offset to
                           ; address place.
place
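the addressing behaviour behind the first three examples can be summarised in a small python model. the function name transfer_address and the register values in the usage part are hypothetical; the model follows the p, u and w rules of 4.7.1 and ignores the {t} use of the w bit.

```python
def transfer_address(base: int, offset: int, pre: bool, up: bool,
                     writeback: bool) -> tuple:
    """Return (address used for the transfer, base register afterwards)."""
    if up:
        modified = (base + offset) & 0xFFFFFFFF
    else:
        modified = (base - offset) & 0xFFFFFFFF
    if pre:
        # pre-indexed: the modified base is the transfer address;
        # it is written back only when w=1 (the ! suffix)
        return modified, (modified if writeback else base)
    # post-indexed: the old base is the transfer address and the
    # modified base is always written back
    return base, modified


if __name__ == "__main__":
    r2, r4 = 0x1000, 0x20   # hypothetical register contents
    print(transfer_address(r2, r4, pre=True, up=True, writeback=True))    # str r1,[r2,r4]!
    print(transfer_address(r2, r4, pre=False, up=True, writeback=False))  # str r1,[r2],r4
    print(transfer_address(r2, 16, pre=True, up=True, writeback=False))   # ldr r1,[r2,#16]
```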
4.8 halfword and signed data transfer (ldrh/strh/ldrsb/ldrsh)

the instruction is only executed if the condition is true. the various conditions are defined in table 4-2: condition code summary on page 4-6. the instruction encoding is shown in figure 4-14: halfword and signed data transfer with register offset, below, and figure 4-15: halfword and signed data transfer with immediate offset on page 4-31.

these instructions are used to load or store halfwords of data and also load sign-extended bytes or halfwords of data. the memory address used in the transfer is calculated by adding an offset to or subtracting an offset from a base register. the result of this calculation may be written back into the base register if auto-indexing is required.

figure 4-14: halfword and signed data transfer with register offset

    bits    field    meaning
    31:28   cond     condition field
    27:25   000
    24      p        pre/post indexing: 0 = post: add/subtract offset after transfer, 1 = pre: add/subtract offset before transfer
    23      u        up/down: 0 = down: subtract offset from base, 1 = up: add offset to base
    22      0
    21      w        write-back: 0 = no write-back, 1 = write address into base
    20      l        load/store: 0 = store to memory, 1 = load from memory
    19:16   rn       base register
    15:12   rd       source/destination register
    11:8    0000
    7       1
    6:5     s h      00 = swp instruction, 01 = unsigned halfwords, 10 = signed byte, 11 = signed halfwords
    4       1
    3:0     rm       offset register
figure 4-15: halfword and signed data transfer with immediate offset

    bits    field    meaning
    31:28   cond     condition field
    27:25   000
    24      p        pre/post indexing: 0 = post: add/subtract offset after transfer, 1 = pre: add/subtract offset before transfer
    23      u        up/down: 0 = down: subtract offset from base, 1 = up: add offset to base
    22      1
    21      w        write-back: 0 = no write-back, 1 = write address into base
    20      l        load/store: 0 = store to memory, 1 = load from memory
    19:16   rn       base register
    15:12   rd       source/destination register
    11:8    offset   immediate offset (high nibble)
    7       1
    6:5     s h      00 = swp instruction, 01 = unsigned halfwords, 10 = signed byte, 11 = signed halfwords
    4       1
    3:0     offset   immediate offset (low nibble)

4.8.1 offsets and auto-indexing

the offset from the base may be either an 8-bit unsigned binary immediate value in the instruction, or a second register. the 8-bit offset is formed by concatenating bits 11 to 8 and bits 3 to 0 of the instruction word, such that bit 11 becomes the msb and bit 0 becomes the lsb. the offset may be added to (u=1) or subtracted from (u=0) the base register rn. the offset modification may be performed either before (pre-indexed, p=1) or after (post-indexed, p=0) the base register is used as the transfer address.

the w bit gives optional auto-increment and decrement addressing modes. the modified base value may be written back into the base (w=1), or the old base may be kept (w=0). in the case of post-indexed addressing, the write back bit is redundant and is always set to zero, since the old base value can be retained if necessary by setting the offset to zero. therefore post-indexed data transfers always write back the modified base. the write-back bit should not be set high (w=1) when post-indexed addressing is selected.
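the concatenation of the two offset nibbles can be written out as a short python sketch. the function names are hypothetical, and the instruction word 0xe1c430be in the usage part is an encoding worked out by hand from figure 4-15 for strh r3,[r4,#14] with the "always" condition; treat it as an assumption rather than a value quoted from this data sheet.

```python
def halfword_immediate_offset(word: int) -> int:
    """Join bits 11:8 (high nibble) and bits 3:0 (low nibble) into 8 bits."""
    high = (word >> 8) & 0xF
    low = word & 0xF
    return (high << 4) | low


def split_offset(offset: int) -> tuple:
    """Inverse operation: split an 8-bit offset into (high, low) nibbles."""
    if not 0 <= offset <= 255:
        raise ValueError("offset must fit in 8 bits")
    return (offset >> 4) & 0xF, offset & 0xF


if __name__ == "__main__":
    print(halfword_immediate_offset(0xE1C430BE))  # assumed: strh r3,[r4,#14]
    print(split_offset(223))                      # nibbles for an offset of 223
```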
4.8.2 halfword load and stores

setting s=0 and h=1 may be used to transfer unsigned halfwords between an arm610 register and memory. the action of ldrh and strh instructions is influenced by the bigend control signal. the two possible configurations are described in the section below.

4.8.3 signed byte and halfword loads

the s bit controls the loading of sign-extended data. when s=1 the h bit selects between bytes (h=0) and halfwords (h=1). the l bit should not be set low (store) when signed (s=1) operations have been selected.

the ldrsb instruction loads the selected byte into bits 7 to 0 of the destination register and bits 31 to 8 of the destination register are set to the value of bit 7, the sign bit.

the ldrsh instruction loads the selected halfword into bits 15 to 0 of the destination register and bits 31 to 16 of the destination register are set to the value of bit 15, the sign bit.

the action of the ldrsb and ldrsh instructions is influenced by the bigend control signal. the two possible configurations are described in the following section.

4.8.4 endianness and byte/halfword selection

little-endian configuration

a signed byte load (ldrsb) expects data on data bus inputs 7 through to 0 if the supplied address is on a word boundary, on data bus inputs 15 through to 8 if it is a word address plus one byte, and so on. the selected byte is placed in the bottom 8 bits of the destination register, and the remaining bits of the register are filled with the sign bit, bit 7 of the byte. please see figure 3-2: little endian addresses of bytes within words on page 3-3.

a halfword load (ldrsh or ldrh) expects data on data bus inputs 15 through to 0 if the supplied address is on a word boundary and on data bus inputs 31 through to 16 if it is a halfword boundary (a[1]=1). the supplied address should always be on a halfword boundary. if bit 0 of the supplied address is high, the arm610 will load an unpredictable value. the selected halfword is placed in the bottom 16 bits of the destination register. for unsigned halfwords (ldrh), the top 16 bits of the register are filled with zeros and for signed halfwords (ldrsh) the top 16 bits are filled with the sign bit, bit 15 of the halfword.

a halfword store (strh) repeats the bottom 16 bits of the source register twice across the data bus outputs 31 through to 0. the external memory system should activate the appropriate halfword subsystem to store the data. note that the address must be halfword aligned; if bit 0 of the address is high this will cause unpredictable behaviour.
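the little-endian selection and sign extension rules above can be modelled as follows. this python sketch is illustrative only: the function names are hypothetical, bus_word stands for the 32-bit value on the data bus, and an odd halfword address is reported as an error here simply because the text calls the result unpredictable.

```python
def sign_extend(value: int, bits: int) -> int:
    """Copy bit (bits-1) of value into all higher bits of a 32-bit result."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & 0xFFFFFFFF


def ldrsb_little_endian(bus_word: int, address: int) -> int:
    """Signed byte load: lane chosen by address bits 1:0, then sign-extended."""
    byte = (bus_word >> (8 * (address & 3))) & 0xFF
    return sign_extend(byte, 8)


def load_halfword_little_endian(bus_word: int, address: int,
                                signed: bool) -> int:
    """ldrh (signed=False) or ldrsh (signed=True)."""
    if address & 1:
        raise ValueError("bit 0 of the address is high: result unpredictable")
    half = (bus_word >> (16 if address & 2 else 0)) & 0xFFFF
    return sign_extend(half, 16) if signed else half


if __name__ == "__main__":
    word = 0xAABBCCDD   # hypothetical bus value
    print(hex(ldrsb_little_endian(word, 0)))
    print(hex(load_halfword_little_endian(word, 2, signed=False)))
    print(hex(load_halfword_little_endian(word, 2, signed=True)))
```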
big-endian configuration

a signed byte load (ldrsb) expects data on data bus inputs 31 through to 24 if the supplied address is on a word boundary, on data bus inputs 23 through to 16 if it is a word address plus one byte, and so on. the selected byte is placed in the bottom 8 bits of the destination register, and the remaining bits of the register are filled with the sign bit, bit 7 of the byte. please see figure 3-1: big endian addresses of bytes within words on page 3-3.

a halfword load (ldrsh or ldrh) expects data on data bus inputs 31 through to 16 if the supplied address is on a word boundary and on data bus inputs 15 through to 0 if it is a halfword boundary (a[1]=1). the supplied address should always be on a halfword boundary. if bit 0 of the supplied address is high then the arm610 will load an unpredictable value. the selected halfword is placed in the bottom 16 bits of the destination register. for unsigned halfwords (ldrh), the top 16 bits of the register are filled with zeros and for signed halfwords (ldrsh) the top 16 bits are filled with the sign bit, bit 15 of the halfword.

a halfword store (strh) repeats the bottom 16 bits of the source register twice across the data bus outputs 31 through to 0. the external memory system should activate the appropriate halfword subsystem to store the data. note that the address must be halfword aligned; if bit 0 of the address is high this will cause unpredictable behaviour.

4.8.5 use of r15

write-back should not be specified if r15 is specified as the base register (rn). when using r15 as the base register you must remember it contains an address 8 bytes on from the address of the current instruction. r15 should not be specified as the register offset (rm). when r15 is the source register (rd) of a halfword store (strh) instruction, the stored value will be the address of the instruction plus 12.
4.8.6 data aborts

a transfer to or from a legal address may cause problems for a memory management system. for instance, in a system which uses virtual memory the required data may be absent from the main memory. the memory manager can signal a problem by taking the processor abort input high whereupon the data abort trap will be taken. it is up to the system software to resolve the cause of the problem, then the instruction can be restarted and the original program continued.

4.8.7 instruction cycle times

normal ldr(h,sh,sb) instructions take 1s + 1n + 1i and ldr(h,sh,sb) pc take 2s + 2n + 1i incremental cycles. s, n and i are defined in 6.2 cycle types on page 6-2. strh instructions take 2n incremental cycles to execute.
4.8.8 assembler syntax

    <ldr|str>{cond}<h|sh|sb> rd,<address>

ldr      load from memory into a register
str      store from a register into memory
{cond}   two-character condition mnemonic. see table 4-2: condition code summary on page 4-6.
h        transfer halfword quantity
sb       load sign extended byte (only valid for ldr)
sh       load sign extended halfword (only valid for ldr)
rd       is an expression evaluating to a valid register number.

<address> can be:

1 an expression which generates an address:
  the assembler will attempt to generate an instruction using the pc as a base and a corrected immediate offset to address the location given by evaluating the expression. this will be a pc relative, pre-indexed address. if the address is out of range, an error will be generated.

2 a pre-indexed addressing specification:
    [rn]                    offset of zero
    [rn,<#expression>]{!}   offset of <expression> bytes
    [rn,{+/-}rm]{!}         offset of +/- contents of index register

3 a post-indexed addressing specification:
    [rn],<#expression>      offset of <expression> bytes
    [rn],{+/-}rm            offset of +/- contents of index register.

rn and rm   are expressions evaluating to a register number. if rn is r15 then the assembler will subtract 8 from the offset value to allow for arm610 pipelining. in this case base write-back should not be specified.
{!}         writes back the base register (set the w bit) if ! is present.
4.8.9 examples

    ldrh r1,[r2,-r3]!       ; load r1 from the contents of the
                            ; halfword address contained in
                            ; r2-r3 (both of which are registers)
                            ; and write back address to r2
    strh r3,[r4,#14]        ; store the halfword in r3 at r4+14
                            ; but don't write back.
    ldrsb r8,[r2],#-223     ; load r8 with the sign extended
                            ; contents of the byte address
                            ; contained in r2 and write back
                            ; r2-223 to r2.
    ldrnesh r11,[r0]        ; conditionally load r11 with the sign
                            ; extended contents of the halfword
                            ; address contained in r0.
here                        ; generate pc relative offset to
                            ; address fred.
                            ; store the halfword in r5 at address
                            ; fred.
    strh r5,[pc,#(fred-here-8)]
    .
fred
4.9 block data transfer (ldm, stm)

the instruction is only executed if the condition is true. the various conditions are defined in table 4-2: condition code summary on page 4-6. the instruction encoding is shown in figure 4-16: block data transfer instructions.

block data transfer instructions are used to load (ldm) or store (stm) any subset of the currently visible registers. they support all possible stacking modes, maintaining full or empty stacks which can grow up or down memory, and are very efficient instructions for saving or restoring context, or for moving large blocks of data around main memory.

4.9.1 the register list

the instruction can cause the transfer of any registers in the current bank (and non-user mode programs can also transfer to and from the user bank, see below). the register list is a 16-bit field in the instruction, with each bit corresponding to a register. a 1 in bit 0 of the register field will cause r0 to be transferred, a 0 will cause it not to be transferred; similarly bit 1 controls the transfer of r1, and so on.

any subset of the registers, or all the registers, may be specified. the only restriction is that the register list should not be empty.

whenever r15 is stored to memory the stored value is the address of the stm instruction plus 12.

figure 4-16: block data transfer instructions

    bits    field           meaning
    31:28   cond            condition field
    27:25   100
    24      p               pre/post indexing bit: 0 = post; add offset after transfer, 1 = pre; add offset before transfer
    23      u               up/down bit: 0 = down; subtract offset from base, 1 = up; add offset to base
    22      s               psr and force user bit: 0 = do not load psr or force user mode, 1 = load psr or force user mode
    21      w               write-back bit: 0 = no write-back, 1 = write address into base
    20      l               load/store bit: 0 = store to memory, 1 = load from memory
    19:16   rn              base register
    15:0    register list
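the register list encoding described in 4.9.1 can be sketched in python as below. the function names are hypothetical; the example list in the usage part is simply an arbitrary set of registers chosen for illustration.

```python
def register_list_field(registers) -> int:
    """Build the 16-bit register list field: bit n set means rn is transferred."""
    field = 0
    for reg in registers:
        if not 0 <= reg <= 15:
            raise ValueError("register numbers run from 0 to 15")
        field |= 1 << reg
    if field == 0:
        raise ValueError("the register list should not be empty")
    return field


def registers_in_field(field: int) -> list:
    """Inverse: list the register numbers selected by a 16-bit field."""
    return [reg for reg in range(16) if field & (1 << reg)]


if __name__ == "__main__":
    regs = [0, 2, 3, 4, 5, 6, 7, 10]          # hypothetical register list
    field = register_list_field(regs)
    print(hex(field), registers_in_field(field))
```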
4.9.2 addressing modes

the transfer addresses are determined by the contents of the base register (rn), the pre/post bit (p) and the up/down bit (u). the registers are transferred in the order lowest to highest, so r15 (if in the list) will always be transferred last. the lowest register also gets transferred to/from the lowest memory address. by way of illustration, consider the transfer of r1, r5 and r7 in the case where rn=0x1000 and write back of the modified base is required (w=1). figure 4-17: post-increment addressing, figure 4-18: pre-increment addressing, figure 4-19: post-decrement addressing and figure 4-20: pre-decrement addressing show the sequence of register transfers, the addresses used, and the value of rn after the instruction has completed.

in all cases, had write back of the modified base not been required (w=0), rn would have retained its initial value of 0x1000 unless it was also in the transfer list of a load multiple register instruction, when it would have been overwritten with the loaded value.

4.9.3 address alignment

the address should normally be a word aligned quantity and non-word aligned addresses do not affect the instruction. however, the bottom 2 bits of the address will appear on a[1:0] and might be interpreted by the memory system.

figure 4-17: post-increment addressing
(diagram: four steps showing r1, r5 and r7 being transferred and the position of rn, with memory addresses 0x0ff4, 0x1000 and 0x100c marked)
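the four addressing modes of 4.9.2 can be tabulated with a short python model, using the same illustration of r1, r5 and r7 with rn=0x1000 and write-back requested. this is a sketch built from the rules stated in the text (lowest register to lowest address, one word per register); the function name block_transfer is hypothetical.

```python
def block_transfer(base: int, registers, pre: bool, up: bool) -> tuple:
    """Return ({register: address}, value of rn after write-back)."""
    count = len(registers)
    if up:
        start = base + (4 if pre else 0)
        final = base + 4 * count
    else:
        start = base - 4 * count + (0 if pre else 4)
        final = base - 4 * count
    # lowest register always goes to the lowest address
    addresses = {reg: start + 4 * i for i, reg in enumerate(sorted(registers))}
    return addresses, final


if __name__ == "__main__":
    for name, pre, up in (("post-increment", False, True),
                          ("pre-increment", True, True),
                          ("post-decrement", False, False),
                          ("pre-decrement", True, False)):
        addresses, final = block_transfer(0x1000, [1, 5, 7], pre, up)
        listing = ", ".join(f"r{r}@{a:#06x}" for r, a in addresses.items())
        print(f"{name:15s} {listing}  rn={final:#06x}")
```

in every case the final value of rn is either 0x100c or 0x0ff4, which matches the addresses marked in the figures.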
figure 4-18: pre-increment addressing
(diagram: four steps showing r1, r5 and r7 being transferred and the position of rn, with memory addresses 0x0ff4, 0x1000 and 0x100c marked)

figure 4-19: post-decrement addressing
(diagram: four steps showing r1, r5 and r7 being transferred and the position of rn, with memory addresses 0x0ff4, 0x1000 and 0x100c marked)
figure 4-20: pre-decrement addressing
(diagram: four steps showing r1, r5 and r7 being transferred and the position of rn, with memory addresses 0x0ff4, 0x1000 and 0x100c marked)

4.9.4 use of the s bit

when the s bit is set in a ldm/stm instruction its meaning depends on whether or not r15 is in the transfer list and on the type of instruction. the s bit should only be set if the instruction is to execute in a privileged mode.

ldm with r15 in transfer list and s bit set (mode changes)

if the instruction is a ldm then spsr_<mode> is transferred to cpsr at the same time as r15 is loaded.

stm with r15 in transfer list and s bit set (user bank transfer)

the registers transferred are taken from the user bank rather than the bank corresponding to the current mode. this is useful for saving the user state on process switches. base write-back should not be used when this mechanism is employed.

r15 not in list and s bit set (user bank transfer)

for both ldm and stm instructions, the user bank registers are transferred rather than the register bank corresponding to the current mode. this is useful for saving the user state on process switches. base write-back should not be used when this mechanism is employed.

when the instruction is ldm, care must be taken not to read from a banked register during the following cycle (inserting a dummy instruction such as mov r0,r0 after the ldm will ensure safety).
4.9.5 use of r15 as the base

r15 should not be used as the base register in any ldm or stm instruction.

4.9.6 inclusion of the base in the register list

when write-back is specified, the base is written back at the end of the second cycle of the instruction. during a stm, the first register is written out at the start of the second cycle. a stm which includes storing the base, with the base as the first register to be stored, will therefore store the unchanged value, whereas with the base second or later in the transfer order, will store the modified value. a ldm will always overwrite the updated base if the base is in the list.

4.9.7 data aborts

some legal addresses may be unacceptable to a memory management system, and the memory manager can indicate a problem with an address by taking the abort signal high. this can happen on any transfer during a multiple register load or store, and must be recoverable if arm610 is to be used in a virtual memory system.

aborts during stm instructions

if the abort occurs during a store multiple instruction, arm610 takes little action until the instruction completes, whereupon it enters the data abort trap. the memory manager is responsible for preventing erroneous writes to the memory. the only change to the internal state of the processor will be the modification of the base register if write-back was specified, and this must be reversed by software (and the cause of the abort resolved) before the instruction may be retried.

aborts during ldm instructions

when arm610 detects a data abort during a load multiple instruction, it modifies the operation of the instruction to ensure that recovery is possible.

1 overwriting of registers stops when the abort happens. the aborting load will not take place but earlier ones may have overwritten registers. the pc is always the last register to be written and so will always be preserved.

2 the base register is restored, to its modified value if write-back was requested. this ensures recoverability in the case where the base register is also in the transfer list, and may have been overwritten before the abort occurred.

the data abort trap is taken when the load multiple has completed, and the system software must undo any base modification (and resolve the cause of the abort) before restarting the instruction.
4.9.8 Assembler syntax

<LDM|STM>{cond}<FD|ED|FA|EA|IA|IB|DA|DB> Rn{!},<Rlist>{^}

where:

{cond}    two character condition mnemonic. See Table 4-2: condition code summary on page 4-6.
Rn        is an expression evaluating to a valid register number
<Rlist>   is a list of registers and register ranges enclosed in {} (e.g. {R0,R2-R7,R10}).
{!}       if present requests write-back (W=1), otherwise W=0
{^}       if present set S bit to load the CPSR along with the PC, or force transfer of user bank when in privileged mode

Addressing mode names

There are different assembler mnemonics for each of the addressing modes, depending on whether the instruction is being used to support stacks or for other purposes. The equivalence between the names and the values of the bits in the instruction is shown in Table 4-4.

FD, ED, FA, EA define pre/post indexing and the up/down bit by reference to the form of stack required. The F and E refer to a full or empty stack, i.e. whether a pre-index has to be done (full) before storing to the stack. The A and D refer to whether the stack is ascending or descending. If ascending, a STM will go up and LDM down; if descending, vice-versa.

IA, IB, DA, DB allow control when LDM/STM are not being used for stacks and simply mean increment after, increment before, decrement after, decrement before.

Name                    Stack   Other   L bit   P bit   U bit
pre-increment load      LDMED   LDMIB   1       1       1
post-increment load     LDMFD   LDMIA   1       0       1
pre-decrement load      LDMEA   LDMDB   1       1       0
post-decrement load     LDMFA   LDMDA   1       0       0
pre-increment store     STMFA   STMIB   0       1       1
post-increment store    STMEA   STMIA   0       0       1
pre-decrement store     STMFD   STMDB   0       1       0
post-decrement store    STMED   STMDA   0       0       0

Table 4-4: addressing mode names
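The mapping in Table 4-4 can be sketched as a small lookup, together with the addresses a block transfer would touch. This is only an illustrative model, not part of the data sheet: the function names are hypothetical, and it assumes the usual ARM behaviour that registers are transferred lowest-numbered first to the lowest address, one 4-byte word each.

```python
# Table 4-4 as data. "Other" suffixes map directly to (P bit, U bit);
# stack suffixes depend on whether the instruction is a load (L=1) or store.
OTHER = {"IB": (1, 1), "IA": (0, 1), "DB": (1, 0), "DA": (0, 0)}
STACK = {
    ("ED", True): "IB", ("FD", True): "IA",
    ("EA", True): "DB", ("FA", True): "DA",
    ("FA", False): "IB", ("EA", False): "IA",
    ("FD", False): "DB", ("ED", False): "DA",
}


def pu_bits(suffix, is_load):
    """Return (P, U) for a stack or non-stack addressing mode suffix."""
    suffix = STACK.get((suffix, is_load), suffix)
    return OTHER[suffix]


def transfer_addresses(base, count, p, u):
    """Return (word addresses in transfer order, written-back base).

    Assumption (not stated in this section): the lowest-numbered register
    always goes to the lowest address, and each register occupies 4 bytes.
    """
    span = 4 * count
    if u:
        start = base + 4 if p else base
        final = base + span
    else:
        start = base - span if p else base - span + 4
        final = base - span
    return [start + 4 * i for i in range(count)], final


if __name__ == "__main__":
    p, u = pu_bits("FD", is_load=False)      # STMFD is pre-decrement
    addrs, new_base = transfer_addresses(0x1000, 3, p, u)
    print([hex(a) for a in addrs], hex(new_base))
```

With a base of 0x1000 and three registers, the pre-decrement case gives the same 0x0ff4 final base that appears in Figure 4-20.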
4.9.9 Examples

    ldmfd   sp!,{r0,r1,r2}      ; unstack 3 registers.
    stmia   r0,{r0-r15}         ; save all registers.
    ldmfd   sp!,{r15}           ; r15 <- (sp), cpsr unchanged.
    ldmfd   sp!,{r15}^          ; r15 <- (sp), cpsr <- spsr_mode
                                ; (allowed only in privileged modes).
    stmfd   r13,{r0-r14}^       ; save user mode regs on stack
                                ; (allowed only in privileged modes).

These instructions may be used to save state on subroutine entry, and restore it efficiently on return to the calling routine:

    stmed   sp!,{r0-r3,r14}     ; save r0 to r3 to use as workspace
                                ; and r14 for returning.
    bl      somewhere           ; this nested call will overwrite r14
    ldmed   sp!,{r0-r3,r15}     ; restore workspace and return.
4.10 Single data swap (SWP)

The instruction is only executed if the condition is true. The various conditions are defined in Table 4-2: condition code summary on page 4-6. The instruction encoding is shown in Figure 4-21: swap instruction.

Figure 4-21: swap instruction

    bits 31-28   cond    condition field
    bits 27-23   00010
    bit  22      B       byte/word bit: 0 = swap word quantity, 1 = swap byte quantity
    bits 21-20   00
    bits 19-16   Rn      base register
    bits 15-12   Rd      destination register
    bits 11-8    0000
    bits 7-4     1001
    bits 3-0     Rm      source register

The data swap instruction is used to swap a byte or word quantity between a register and external memory. This instruction is implemented as a memory read followed by a memory write which are locked together (the processor cannot be interrupted until both operations have completed, and the memory manager is warned to treat them as inseparable). This class of instruction is particularly useful for implementing software semaphores.

The swap address is determined by the contents of the base register (Rn). The processor first reads the contents of the swap address. Then it writes the contents of the source register (Rm) to the swap address, and stores the old memory contents in the destination register (Rd). The same register may be specified as both the source and destination.

The LOCK output goes high for the duration of the read and write operations to signal to the external memory manager that they are locked together, and should be allowed to complete without interruption. This is important in multi-processor systems where the swap instruction is the only indivisible instruction which may be used to implement semaphores; control of the memory must not be removed from a processor while it is performing a locked operation.

4.10.1 Bytes and words

This instruction class may be used to swap a byte (B=1) or a word (B=0) between an ARM610 register and memory. The SWP instruction is implemented as a LDR followed by a STR and the action of these is as described in the section on single data transfers. In particular, the description of big and little-endian configuration applies to the SWP instruction.
4.10.2 Use of R15

Do not use R15 as an operand (Rd, Rn or Rm) in a SWP instruction.

4.10.3 Data aborts

If the address used for the swap is unacceptable to a memory management system, the memory manager can flag the problem by driving ABORT high. This can happen on either the read or the write cycle (or both), and in either case, the data abort trap will be taken. It is up to the system software to resolve the cause of the problem; the instruction can then be restarted and the original program continued.

4.10.4 Instruction cycle times

Swap instructions take 1S + 2N + 1I incremental cycles to execute, where S, N and I are as defined in 6.2 cycle types on page 6-2.

4.10.5 Assembler syntax

<SWP>{cond}{B} Rd,Rm,[Rn]

{cond}      two-character condition mnemonic. See Table 4-2: condition code summary on page 4-6.
{B}         if B is present then byte transfer, otherwise word transfer
Rd,Rm,Rn    are expressions evaluating to valid register numbers

4.10.6 Examples

    swp     r0,r1,[r2]      ; load r0 with the word addressed by r2, and
                            ; store r1 at r2.
    swpb    r2,r3,[r4]      ; load r2 with the byte addressed by r4, and
                            ; store bits 0 to 7 of r3 at r4.
    swpeq   r0,r0,[r1]      ; conditionally swap the contents of the
                            ; word addressed by r1 with r0.
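As a rough cross-check of the field layout in Figure 4-21, the sketch below assembles a SWP instruction word from its fields. The function name is hypothetical, and the condition value 0b1110 ("always") is an assumption taken from the standard ARM condition codes in Table 4-2, which is outside this section.

```python
COND_AL = 0b1110  # assumed "always" condition code (Table 4-2, not shown here)


def encode_swp(cond, rd, rm, rn, byte=False):
    """Build the 32-bit SWP/SWPB word from the Figure 4-21 fields."""
    for reg in (rd, rm, rn):
        # 4.10.2: R15 must not be used as an operand.
        if not 0 <= reg <= 14:
            raise ValueError("SWP operands must be R0-R14")
    return (
        (cond << 28)
        | (0b00010 << 23)      # fixed bits 27-23
        | (int(byte) << 22)    # B: 1 = byte, 0 = word
        | (rn << 16)           # base register
        | (rd << 12)           # destination register
        | (0b1001 << 4)        # fixed bits 7-4
        | rm                   # source register
    )


if __name__ == "__main__":
    print(hex(encode_swp(COND_AL, rd=0, rm=1, rn=2)))             # swp  r0,r1,[r2]
    print(hex(encode_swp(COND_AL, rd=2, rm=3, rn=4, byte=True)))  # swpb r2,r3,[r4]
```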
4.11 Software interrupt (SWI)

The instruction is only executed if the condition is true. The various conditions are defined in Table 4-2: condition code summary on page 4-6. The instruction encoding is shown in Figure 4-22: software interrupt instruction, below.

Figure 4-22: software interrupt instruction

    bits 31-28   cond    condition field
    bits 27-24   1111
    bits 23-0    comment field (ignored by processor)

The software interrupt instruction is used to enter supervisor mode in a controlled manner. The instruction causes the software interrupt trap to be taken, which effects the mode change. The PC is then forced to a fixed value (0x08) and the CPSR is saved in SPSR_svc. If the SWI vector address is suitably protected (by external memory management hardware) from modification by the user, a fully protected operating system may be constructed.

4.11.1 Return from the supervisor

The PC is saved in R14_svc upon entering the software interrupt trap, with the PC adjusted to point to the word after the SWI instruction. MOVS PC,R14_svc will return to the calling program and restore the CPSR.

Note that the link mechanism is not re-entrant, so if the supervisor code wishes to use software interrupts within itself it must first save a copy of the return address and SPSR.

4.11.2 Comment field

The bottom 24 bits of the instruction are ignored by the processor, and may be used to communicate information to the supervisor code. For instance, the supervisor may look at this field and use it to index into an array of entry points for routines which perform the various supervisor functions.

4.11.3 Instruction cycle times

Software interrupt instructions take 2S + 1N incremental cycles to execute, where S and N are as defined in 6.2 cycle types on page 6-2.
4.11.4 Assembler syntax

SWI{cond} <expression>

{cond}          two character condition mnemonic, Table 4-2: condition code summary on page 4-6.
<expression>    is evaluated and placed in the comment field (which is ignored by ARM610).

4.11.5 Examples

    swi     readc           ; get next character from read stream.
    swi     writei+?        ; output a ??to the write stream.
    swine   0               ; conditionally call supervisor
                            ; with 0 in comment field.

Supervisor code

The previous examples assume that suitable supervisor code exists, for instance:

    0x08    b       supervisor          ; swi entry point

    entrytable                          ; addresses of supervisor routines
            dcd     zerortn
            dcd     readcrtn
            dcd     writeirtn
            . . .

    zero    equ     0
    readc   equ     256
    writei  equ     512

    supervisor
            ; swi has routine required in bits 8-23 and data (if any) in
            ; bits 0-7.
            ; assumes r13_svc points to a suitable stack
            stmfd   r13!,{r0-r2,r14}    ; save work registers and return
                                        ; address.
            ldr     r0,[r14,#-4]        ; get swi instruction.
            bic     r0,r0,#0xff000000   ; clear top 8 bits.
            mov     r1,r0,lsr#8         ; get routine offset.
            adr     r2,entrytable       ; get start address of entry table.
            ldr     r15,[r2,r1,lsl#2]   ; branch to appropriate routine.

    writeirtn                           ; enter with character in r0 bits 0-7.
            . . . . . .
            ldmfd   r13!,{r0-r2,r15}^   ; restore workspace and return,
                                        ; restoring processor mode and flags.
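The comment-field convention used by the supervisor code above (routine number in bits 8-23, data in bits 0-7) can be modelled in a few lines. This is a sketch of that particular convention only; the processor itself ignores the field, and the function name is hypothetical.

```python
def decode_swi(word):
    """Split a SWI instruction word into (routine index, data byte).

    Follows the convention of the example supervisor: bits 8-23 select the
    routine and bits 0-7 carry optional data.
    """
    if (word >> 24) & 0xF != 0b1111:
        raise ValueError("bits 27-24 are not 1111: not a SWI instruction")
    comment = word & 0x00FFFFFF      # same effect as bic r0,r0,#0xff000000
    return comment >> 8, comment & 0xFF


if __name__ == "__main__":
    WRITEI = 512                     # writei equ 512 in the example
    word = 0xEF000000 | (WRITEI + 0x41)
    print(decode_swi(word))          # routine index and data byte
```

The routine index is what the example code scales by 4 (lsl#2) to index entrytable.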
4.12 Coprocessor data operations (CDP)

The instruction is only executed if the condition is true. The various conditions are defined in Table 4-2: condition code summary on page 4-6. The instruction encoding is shown in Figure 4-23: coprocessor data operation instruction.

This class of instruction is used to tell a coprocessor to perform some internal operation. No result is communicated back to ARM610, and it will not wait for the operation to complete. The coprocessor could contain a queue of such instructions awaiting execution, and their execution can overlap other activity, allowing the coprocessor and ARM610 to perform independent tasks in parallel.

Figure 4-23: coprocessor data operation instruction

    bits 31-28   cond      condition field
    bits 27-24   1110
    bits 23-20   CP Opc    coprocessor operation code
    bits 19-16   CRn       coprocessor operand register
    bits 15-12   CRd       coprocessor destination register
    bits 11-8    CP#       coprocessor number
    bits 7-5     CP        coprocessor information
    bit  4       0
    bits 3-0     CRm       coprocessor operand register

4.12.1 The coprocessor fields

Only bit 4 and bits 24 to 31 are significant to ARM610. The remaining bits are used by coprocessors. The above field names are used by convention, and particular coprocessors may redefine the use of all fields except CP# as appropriate. The CP# field is used to contain an identifying number (in the range 0 to 15) for each coprocessor, and a coprocessor will ignore any instruction which does not contain its number in the CP# field.

The conventional interpretation of the instruction is that the coprocessor should perform an operation specified in the CP Opc field (and possibly in the CP field) on the contents of CRn and CRm, and place the result in CRd.

4.12.2 Instruction cycle times

Coprocessor data operations take 1S + bI incremental cycles to execute, where b is the number of cycles spent in the coprocessor busy-wait loop. S and I are as defined in 6.2 cycle types on page 6-2.
4.12.3 Assembler syntax

CDP{cond} p#,<expression1>,cd,cn,cm{,<expression2>}

{cond}           two character condition mnemonic. See Table 4-2: condition code summary on page 4-6.
p#               the unique number of the required coprocessor
<expression1>    evaluated to a constant and placed in the CP Opc field
cd, cn and cm    evaluate to the valid coprocessor register numbers CRd, CRn and CRm respectively
<expression2>    where present is evaluated to a constant and placed in the CP field

4.12.4 Examples

    cdp     p1,10,c1,c2,c3      ; request coproc 1 to do operation 10
                                ; on cr2 and cr3, and put the result
                                ; in cr1.
    cdpeq   p2,5,c1,c2,c3,2     ; if z flag is set request coproc 2
                                ; to do operation 5 (type 2) on cr2
                                ; and cr3, and put the result in cr1.
4.13 Coprocessor data transfers (LDC, STC)

The instruction is only executed if the condition is true. The various conditions are defined in Table 4-2: condition code summary on page 4-6. The instruction encoding is shown in Figure 4-24: coprocessor data transfer instructions.

This class of instruction is used to load (LDC) or store (STC) a subset of a coprocessor's registers directly to memory. ARM610 is responsible for supplying the memory address, and the coprocessor supplies or accepts the data and controls the number of words transferred.

Figure 4-24: coprocessor data transfer instructions

    bits 31-28   cond      condition field
    bits 27-25   110
    bit  24      P         pre/post indexing bit: 0 = post; add offset after transfer, 1 = pre; add offset before transfer
    bit  23      U         up/down bit: 0 = down; subtract offset from base, 1 = up; add offset to base
    bit  22      N         transfer length
    bit  21      W         write-back bit: 0 = no write-back, 1 = write address into base
    bit  20      L         load/store bit: 0 = store to memory, 1 = load from memory
    bits 19-16   Rn        base register
    bits 15-12   CRd       coprocessor source/destination register
    bits 11-8    CP#       coprocessor number
    bits 7-0     offset    unsigned 8-bit immediate offset

4.13.1 The coprocessor fields

The CP# field is used to identify the coprocessor which is required to supply or accept the data, and a coprocessor will only respond if its number matches the contents of this field.

The CRd field and the N bit contain information for the coprocessor which may be interpreted in different ways by different coprocessors, but by convention CRd is the register to be transferred (or the first register where more than one is to be transferred), and the N bit is used to choose one of two transfer length options. For instance N=0 could select the transfer of a single register, and N=1 could select the transfer of all the registers for context switching.
4.13.2 Addressing modes

ARM610 is responsible for providing the address used by the memory system for the transfer, and the addressing modes available are a subset of those used in single data transfer instructions. Note, however, that the immediate offsets are 8 bits wide and specify word offsets for coprocessor data transfers, whereas they are 12 bits wide and specify byte offsets for single data transfers.

The 8-bit unsigned immediate offset is shifted left 2 bits and either added to (U=1) or subtracted from (U=0) the base register (Rn); this calculation may be performed either before (P=1) or after (P=0) the base is used as the transfer address. The modified base value may be written back into the base register (if W=1), or the old value of the base may be preserved (W=0). Note that post-indexed addressing modes require explicit setting of the W bit, unlike LDR and STR which always write-back when post-indexed.

The value of the base register, modified by the offset in a pre-indexed instruction, is used as the address for the transfer of the first word. The second word (if more than one is transferred) will go to or come from an address one word (4 bytes) higher than the first transfer, and the address will be incremented by one word for each subsequent transfer.

4.13.3 Address alignment

The base address should normally be a word-aligned quantity. The bottom 2 bits of the address will appear on A[1:0] and might be interpreted by the memory system.

4.13.4 Use of R15

If Rn is R15, the value used will be the address of the instruction plus 8 bytes. Base write-back to R15 must not be specified.

4.13.5 Data aborts

If the address is legal but the memory manager generates an abort, the data abort trap will be taken. The write-back of the modified base will take place, but all other processor state will be preserved. The coprocessor is partly responsible for ensuring that the data transfer can be restarted after the cause of the abort has been resolved, and must ensure that any subsequent actions it undertakes can be repeated when the instruction is retried.

4.13.6 Instruction cycle times

Coprocessor data transfer instructions take (n-1)S + 2N + bI incremental cycles to execute, where:

n           is the number of words transferred.
b           is the number of cycles spent in the coprocessor busy-wait loop.
S, N and I  are as defined in 6.2 cycle types on page 6-2.
4.13.7 Assembler syntax

<LDC|STC>{cond}{L} p#,cd,<Address>

LDC         load from memory to coprocessor
STC         store from coprocessor to memory
{L}         when present perform long transfer (N=1), otherwise perform short transfer (N=0)
{cond}      two character condition mnemonic. See Table 4-2: condition code summary on page 4-6.
p#          the unique number of the required coprocessor
cd          is an expression evaluating to a valid coprocessor register number that is placed in the CRd field

<Address> can be:

1 An expression which generates an address:

      <expression>

  The assembler will attempt to generate an instruction using the PC as a base and a corrected immediate offset to address the location given by evaluating the expression. This will be a PC relative, pre-indexed address. If the address is out of range, an error will be generated.

2 A pre-indexed addressing specification:

      [Rn]                        offset of zero
      [Rn,<#expression>]{!}       offset of <expression> bytes

3 A post-indexed addressing specification:

      [Rn],<#expression>          offset of <expression> bytes

{!}         write back the base register (set the W bit) if ! is present
Rn          is an expression evaluating to a valid ARM610 register number.

Note: if Rn is R15, the assembler will subtract 8 from the offset value to allow for ARM610 pipelining.
4.13.8 Examples

    ldc     p1,c2,table         ; load c2 of coproc 1 from address
                                ; table, using a pc relative address.
    stceql  p2,c3,[r5,#24]!     ; conditionally store c3 of coproc 2
                                ; into an address 24 bytes up from r5,
                                ; write this address back to r5, and use
                                ; long transfer option (probably to
                                ; store multiple words).

Note: although the address offset is expressed in bytes, the instruction offset field is in words. The assembler will adjust the offset appropriately.
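The byte-to-word offset adjustment described in the note can be illustrated with a small encoder for the LDC/STC format. This is a sketch only: the function and parameter names are hypothetical, bits 27-25 are taken as 110 (the standard ARM coprocessor data transfer pattern), and the EQ condition value 0b0000 is an assumption from the standard condition codes in Table 4-2. It does not model the PC-relative adjustment of 8.

```python
COND_EQ = 0b0000  # assumed "equal" condition code (Table 4-2, not shown here)


def encode_cp_transfer(cond, load, cp_num, crd, rn, byte_offset,
                       pre=True, writeback=False, long_transfer=False):
    """Build an LDC/STC word; byte_offset is the signed offset in bytes."""
    if byte_offset % 4:
        raise ValueError("offset must be a multiple of 4 bytes")
    words = abs(byte_offset) // 4          # the field holds a word offset
    if words > 0xFF:
        raise ValueError("offset does not fit the 8-bit word offset field")
    up = byte_offset >= 0                  # U=1 adds, U=0 subtracts
    return (
        (cond << 28)
        | (0b110 << 25)
        | (int(pre) << 24)                 # P
        | (int(up) << 23)                  # U
        | (int(long_transfer) << 22)       # N
        | (int(writeback) << 21)           # W
        | (int(load) << 20)                # L
        | (rn << 16)
        | (crd << 12)
        | (cp_num << 8)
        | words
    )


if __name__ == "__main__":
    # stceql p2,c3,[r5,#24]!  -> 24 bytes become a word offset of 6
    word = encode_cp_transfer(COND_EQ, load=False, cp_num=2, crd=3, rn=5,
                              byte_offset=24, pre=True, writeback=True,
                              long_transfer=True)
    print(hex(word), word & 0xFF)
```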
4.14 Coprocessor register transfers (MRC, MCR)

The instruction is only executed if the condition is true. The various conditions are defined in Table 4-2: condition code summary on page 4-6. The instruction encoding is shown in Figure 4-25: coprocessor register transfer instructions.

This class of instruction is used to communicate information directly between ARM610 and a coprocessor. An example of a coprocessor to ARM610 register transfer (MRC) instruction would be a FIX of a floating point value held in a coprocessor, where the floating point number is converted into a 32-bit integer within the coprocessor, and the result is then transferred to an ARM610 register. A FLOAT of a 32-bit value in an ARM610 register into a floating point value within the coprocessor illustrates the use of ARM610 register to coprocessor transfer (MCR).

An important use of this instruction is to communicate control information directly from the coprocessor into the ARM610 CPSR flags. As an example, the result of a comparison of two floating point values within a coprocessor can be moved to the CPSR to control the subsequent flow of execution.

Figure 4-25: coprocessor register transfer instructions

    bits 31-28   cond      condition field
    bits 27-24   1110
    bits 23-21   CP Opc    coprocessor operation mode
    bit  20      L         load/store bit: 0 = store to coprocessor, 1 = load from coprocessor
    bits 19-16   CRn       coprocessor source/destination register
    bits 15-12   Rd        ARM source/destination register
    bits 11-8    CP#       coprocessor number
    bits 7-5     CP        coprocessor information
    bit  4       1
    bits 3-0     CRm       coprocessor operand register

4.14.1 The coprocessor fields

The CP# field is used, as for all coprocessor instructions, to specify which coprocessor is being called upon.

The CP Opc, CRn, CP and CRm fields are used only by the coprocessor, and the interpretation presented here is derived from convention only. Other interpretations are allowed where the coprocessor functionality is incompatible with this one. The conventional interpretation is that the CP Opc and CP fields specify the operation the coprocessor is required to perform, CRn is the coprocessor register which is the source or destination of the transferred information, and CRm is a second coprocessor register which may be involved in some way which depends on the particular operation specified.
4.14.2 Transfers to R15

When a coprocessor register transfer to ARM610 has R15 as the destination, bits 31, 30, 29 and 28 of the transferred word are copied into the N, Z, C and V flags respectively. The other bits of the transferred word are ignored, and the PC and other CPSR bits are unaffected by the transfer.

4.14.3 Transfers from R15

A coprocessor register transfer from ARM610 with R15 as the source register will store the PC+12.

4.14.4 Instruction cycle times

MRC instructions take 1S + (b+1)I + 1C incremental cycles to execute, where S, I and C are as defined in 6.2 cycle types on page 6-2. MCR instructions take 1S + bI + 1C incremental cycles to execute, where b is the number of cycles spent in the coprocessor busy-wait loop.

4.14.5 Assembler syntax

<MCR|MRC>{cond} p#,<expression1>,Rd,cn,cm{,<expression2>}

MRC              move from coprocessor to ARM610 register (L=1)
MCR              move from ARM610 register to coprocessor (L=0)
{cond}           two character condition mnemonic. See Table 4-2: condition code summary on page 4-6.
p#               the unique number of the required coprocessor
<expression1>    evaluated to a constant and placed in the CP Opc field
Rd               is an expression evaluating to a valid ARM610 register number
cn and cm        are expressions evaluating to the valid coprocessor register numbers CRn and CRm respectively
<expression2>    where present is evaluated to a constant and placed in the CP field

4.14.6 Examples

    mrc     p2,5,r3,c5,c6       ; request coproc 2 to perform operation 5
                                ; on c5 and c6, and transfer the (single
                                ; 32-bit word) result back to r3.
    mcr     p6,0,r4,c5,c6       ; request coproc 6 to perform operation 0
                                ; on r4 and place the result in c6.
    mrceq   p3,9,r3,c5,c6,2     ; conditionally request coproc 3 to
                                ; perform operation 9 (type 2) on c5 and
                                ; c6, and transfer the result back to r3.
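The flag update described in 4.14.2 (an MRC with R15 as the destination) can be sketched as a bit-masking operation. The helper names are hypothetical, and the sketch assumes the usual ARM layout in which N, Z, C and V occupy CPSR bits 31 to 28.

```python
FLAG_MASK = 0xF0000000  # assumed position of N, Z, C, V in the CPSR (bits 31-28)


def flags_from_transfer(word):
    """Extract the N, Z, C and V values carried by a transferred word."""
    return {
        "N": (word >> 31) & 1,
        "Z": (word >> 30) & 1,
        "C": (word >> 29) & 1,
        "V": (word >> 28) & 1,
    }


def update_cpsr(cpsr, word):
    """Copy bits 31-28 of the word into the CPSR; all other bits are kept."""
    return (cpsr & ~FLAG_MASK & 0xFFFFFFFF) | (word & FLAG_MASK)


if __name__ == "__main__":
    result = 0x60001234                     # hypothetical coprocessor result
    print(flags_from_transfer(result))
    print(hex(update_cpsr(0x00000013, result)))
```

Only the top four bits of the transferred word have any effect; the low bits of the example value are discarded.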
4.15 Undefined instruction

The instruction is only executed if the condition is true. The various conditions are defined in Table 4-2: condition code summary on page 4-6. The instruction format is shown in Figure 4-26: undefined instruction.

Figure 4-26: undefined instruction

    bits 31-28   cond
    bits 27-25   011
    bits 24-5    xxxxxxxxxxxxxxxxxxxx
    bit  4       1
    bits 3-0     xxxx

If the condition is true, the undefined instruction trap will be taken.

Note that the undefined instruction mechanism involves offering this instruction to any coprocessors which may be present, and all coprocessors must refuse to accept it by driving CPA and CPB high.

4.15.1 Assembler syntax

The assembler has no mnemonics for generating this instruction. If it is adopted in the future for some specified use, suitable mnemonics will be added to the assembler. Until such time, this instruction must not be used.
4.16 Instruction set examples

The following examples show ways in which the basic ARM610 instructions can combine to give efficient code. None of these methods saves a great deal of execution time (although they may save some); mostly they just save code.

4.16.1 Using the conditional instructions

Using conditionals for logical OR

    cmp     rn,#p           ; if rn=p or rm=q then goto label.
    beq     label
    cmp     rm,#q
    beq     label

This can be replaced by

    cmp     rn,#p
    cmpne   rm,#q           ; if condition not satisfied try
                            ; other test.
    beq     label

Absolute value

    teq     rn,#0           ; test sign
    rsbmi   rn,rn,#0        ; and 2's complement if necessary.

Multiplication by 4, 5 or 6 (run time)

    mov     rc,ra,lsl#2     ; multiply by 4,
    cmp     rb,#5           ; test value,
    addcs   rc,rc,ra        ; complete multiply by 5,
    addhi   rc,rc,ra        ; complete multiply by 6.

Combining discrete and range tests

    teq     rc,#127         ; discrete test,
    cmpne   rc,#??1         ; range test
    movls   rc,#?           ; if rc<=??or rc=ascii(127)
                            ; then rc:=??

Division and remainder

A number of divide routines for specific applications are provided in source form as part of the ANSI C library provided with the ARM cross development toolkit, available from your supplier. A short general purpose divide routine follows.

            ; enter with numbers in ra and rb.
            ;
            mov     rcnt,#1             ; bit to control the division.
    div1    cmp     rb,#0x80000000      ; move rb until greater than ra.
            cmpcc   rb,ra
            movcc   rb,rb,asl#1
            movcc   rcnt,rcnt,asl#1
            bcc     div1
            mov     rc,#0
    div2    cmp     ra,rb               ; test for possible subtraction.
            subcs   ra,ra,rb            ; subtract if ok,
            addcs   rc,rc,rcnt          ; put relevant bit into result
            movs    rcnt,rcnt,lsr#1     ; shift control bit
            movne   rb,rb,lsr#1         ; halve unless finished.
            bne     div2
            ;
            ; divide result in rc,
            ; remainder in ra.

Overflow detection in the ARM610

1 Overflow in unsigned multiply with a 32-bit result

    umull   rd,rt,rm,rn         ;3 to 6 cycles
    teq     rt,#0               ;+1 cycle and a register
    bne     overflow

2 Overflow in signed multiply with a 32-bit result

    smull   rd,rt,rm,rn         ;3 to 6 cycles
    teq     rt,rd,asr#31        ;+1 cycle and a register
    bne     overflow

3 Overflow in unsigned multiply accumulate with a 32-bit result

    umlal   rd,rt,rm,rn         ;4 to 7 cycles
    teq     rt,#0               ;+1 cycle and a register
    bne     overflow

4 Overflow in signed multiply accumulate with a 32-bit result

    smlal   rd,rt,rm,rn         ;4 to 7 cycles
    teq     rt,rd,asr#31        ;+1 cycle and a register
    bne     overflow

5 Overflow in unsigned multiply accumulate with a 64-bit result

    umull   rl,rh,rm,rn         ;3 to 6 cycles
    adds    rl,rl,ra1           ;lower accumulate
    adc     rh,rh,ra2           ;upper accumulate
    bcs     overflow            ;1 cycle and 2 registers

6 Overflow in signed multiply accumulate with a 64-bit result

    smull   rl,rh,rm,rn         ;3 to 6 cycles
    adds    rl,rl,ra1           ;lower accumulate
    adc     rh,rh,ra2           ;upper accumulate
    bvs     overflow            ;1 cycle and 2 registers

Note: overflow checking is not applicable to unsigned and signed multiplies with a 64-bit result, since overflow does not occur in such calculations.
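The general purpose divide routine in 4.16.1 is a shift-and-subtract division, and its control flow can be followed more easily in a high-level model. The sketch below mirrors the div1 and div2 loops on 32-bit unsigned values; the function name is hypothetical, and the explicit zero-divisor check is an addition of this sketch (the assembly routine itself does not test for it).

```python
MASK32 = 0xFFFFFFFF


def divide(ra, rb):
    """Return (quotient, remainder) for unsigned 32-bit ra / rb."""
    if rb == 0:
        # Assumption of this sketch: reject zero rather than loop forever.
        raise ZeroDivisionError("divisor is zero")
    rcnt = 1
    # div1: shift the divisor up until it reaches ra or its top bit is set.
    # The CC condition means "unsigned lower", so both tests must pass.
    while rb < 0x80000000 and rb < ra:
        rb = (rb << 1) & MASK32
        rcnt = (rcnt << 1) & MASK32
    rc = 0
    # div2: subtract where possible, then move the control bit down.
    while True:
        if ra >= rb:            # CMP sets carry when ra >= rb (subcs/addcs)
            ra -= rb
            rc += rcnt
        rcnt >>= 1              # movs rcnt,rcnt,lsr#1
        if rcnt == 0:           # bne div2 falls through when rcnt is zero
            break
        rb >>= 1                # movne rb,rb,lsr#1
    return rc, ra


if __name__ == "__main__":
    print(divide(100, 7))
```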
4.16.2 Pseudo-random binary sequence generator

It is often necessary to generate (pseudo-) random numbers and the most efficient algorithms are based on shift generators with exclusive-OR feedback rather like a cyclic redundancy check generator. Unfortunately the sequence of a 32-bit generator needs more than one feedback tap to be maximal length (i.e. 2^32-1 cycles before repetition), so this example uses a 33-bit register with taps at bits 33 and 20. The basic algorithm is newbit:=bit 33 EOR bit 20, shift left the 33-bit number and put in newbit at the bottom; this operation is performed for all the newbits needed (i.e. 32 bits). The entire operation can be done in 5 S cycles:

    ; enter with seed in ra (32 bits), rb (1 bit in rb lsb), uses rc.
    ;
    tst     rb,rb,lsr#1         ; top bit into carry
    movs    rc,ra,rrx           ; 33-bit rotate right
    adc     rb,rb,rb            ; carry into lsb of rb
    eor     rc,rc,ra,lsl#12     ; (involved!)
    eor     ra,rc,rc,lsr#20     ; (similarly involved!)
    ; new seed in ra, rb as before

4.16.3 Multiplication by constant using the barrel shifter

Multiplication by 2^n (1,2,4,8,16,32..)

    mov     ra,rb,lsl #n

Multiplication by 2^n+1 (3,5,9,17..)

    add     ra,ra,ra,lsl #n

Multiplication by 2^n-1 (3,7,15..)

    rsb     ra,ra,ra,lsl #n

Multiplication by 6

    add     ra,ra,ra,lsl #1     ; multiply by 3
    mov     ra,ra,lsl#1         ; and then by 2

Multiply by 10 and add in extra number

    add     ra,ra,ra,lsl#2      ; multiply by 5
    add     ra,rc,ra,lsl#1      ; multiply by 2 and add in next digit

General recursive method for rb := ra*c, c a constant:

1 If c even, say c = 2^n*d, d odd:

    d=1:    mov     rb,ra,lsl #n
    d<>1:   {rb := ra*d}
            mov     rb,rb,lsl #n

2 If c mod 4 = 1, say c = 2^n*d+1, d odd, n>1:

    d=1:    add     rb,ra,ra,lsl #n
    d<>1:   {rb := ra*d}
            add     rb,ra,rb,lsl #n

3 If c mod 4 = 3, say c = 2^n*d-1, d odd, n>1:

    d=1:    rsb     rb,ra,ra,lsl #n
    d<>1:   {rb := ra*d}
            rsb     rb,ra,rb,lsl #n

This is not quite optimal, but close. An example of its non-optimality is multiply by 45 which is done by:

    rsb     rb,ra,ra,lsl#2      ; multiply by 3
    rsb     rb,ra,rb,lsl#2      ; multiply by 4*3-1 = 11
    add     rb,ra,rb,lsl#2      ; multiply by 4*11+1 = 45

rather than by:

    add     rb,ra,ra,lsl#3      ; multiply by 9
    add     rb,rb,rb,lsl#2      ; multiply by 5*9 = 45

4.16.4 Loading a word from an unknown alignment

    ; enter with address in ra (32 bits)
    ; uses rb, rc; result in rd.
    ; note d must be less than c e.g. 0,1
    ;
    bic     rb,ra,#3            ; get word aligned address
    ldmia   rb,{rd,rc}          ; get 64 bits containing answer
    and     rb,ra,#3            ; correction factor in bytes
    movs    rb,rb,lsl#3         ; ...now in bits and test if aligned
    movne   rd,rd,lsr rb        ; produce bottom of result word
                                ; (if not aligned)
    rsbne   rb,rb,#32           ; get other shift amount
    orrne   rd,rd,rc,lsl rb     ; combine two halves to get result
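The unaligned load in 4.16.4 fetches the two aligned words that straddle the address and merges them with a pair of shifts. A minimal model of that sequence follows, assuming little-endian configuration and a flat byte array standing in for memory; the function name is hypothetical.

```python
import struct

MASK32 = 0xFFFFFFFF


def load_unaligned(mem, addr):
    """Return the 32-bit little-endian word starting at any byte address."""
    base = addr & ~3                              # bic rb,ra,#3
    rd, rc = struct.unpack_from("<II", mem, base)  # ldmia rb,{rd,rc}
    shift = (addr & 3) * 8                        # and ..., then lsl#3
    if shift:                                     # movs sets Z when aligned
        rd >>= shift                              # movne rd,rd,lsr rb
        rd |= (rc << (32 - shift)) & MASK32       # rsbne / orrne
    return rd


if __name__ == "__main__":
    memory = bytes(range(16))
    print(hex(load_unaligned(memory, 1)))
```

Because the routine always reads two words, the model (like the assembly) needs the word after the aligned base to be readable even when the address is already aligned.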
5 Configuration

This chapter explains how to configure the ARM610.

5.1 Configuration
5.2 Internal coprocessor instructions
5.3 Registers
5.1 Configuration

The operation and configuration of ARM610 is controlled both directly via coprocessor instructions and indirectly via the memory management page tables. The coprocessor instructions manipulate a number of on-chip registers which control the configuration of the cache, write buffer, MMU and a number of other configuration options.

To ensure backwards compatibility of future CPUs, all reserved or unused bits in registers and coprocessor instructions should be programmed to '0'. Invalid registers must not be read or written. The following bits should be programmed to '0'.

    Register 1    bits[31:9]
    Register 2    bits[13:0]
    Register 5    bits[31:0]
    Register 6    bits[11:0]
    Register 7    bits[31:0]

Note: the grey areas in the register and translation diagrams are reserved and should be programmed 0 for future compatibility.

5.2 Internal coprocessor instructions

The on-chip registers may be read using MRC instructions and written using MCR instructions. These operations are only allowed in non-user modes and the undefined instruction trap will be taken if accesses are attempted in user mode.

Figure 5-1: format of internal coprocessor instructions MRC and MCR

    bits 31-28   cond    ARM condition codes
    bits 27-24   1110
    bit  20      n       1 = MRC register read, 0 = MCR register write
    bits 19-16   CRn     ARM610 register
    bits 15-12   Rd      ARM register
    bits 11-8    1111
    bit  4       1

5.3 Registers

ARM610 contains registers which control the cache and MMU operation. These registers are accessed using CPRT instructions to coprocessor #15 with the processor in a privileged mode.

Only some of registers 0-7 are valid: an access to an invalid register will cause neither the access nor an undefined instruction trap, and therefore should never be carried out; an access to any of the registers 8-15 will cause the undefined instruction trap to be taken.
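The format in Figure 5-1 can be checked by assembling the instruction word for an access to one of the internal registers. This is an illustrative sketch with a hypothetical function name: the fields not labelled in the figure are treated as reserved and set to 0, in line with the rule in 5.1, and the condition value 0b1110 ("always") is an assumption from the standard condition codes in Table 4-2.

```python
COND_AL = 0b1110  # assumed "always" condition code (Table 4-2, not shown here)


def encode_internal_cp(cond, read, crn, rd):
    """Build an MRC (read=True) or MCR (read=False) to coprocessor #15."""
    if not 0 <= crn <= 7:
        # Registers 8-15 only ever cause the undefined instruction trap.
        raise ValueError("only registers 0-7 can be valid")
    if not 0 <= rd <= 15:
        raise ValueError("Rd must be a register number 0-15")
    return (
        (cond << 28)
        | (0b1110 << 24)
        | (int(read) << 20)     # n: 1 = MRC register read, 0 = MCR write
        | (crn << 16)           # ARM610 internal register
        | (rd << 12)            # ARM register
        | (0b1111 << 8)         # coprocessor #15
        | (1 << 4)
    )                           # all remaining bits left at 0 (reserved)


if __name__ == "__main__":
    print(hex(encode_internal_cp(COND_AL, read=False, crn=1, rd=0)))  # write control
    print(hex(encode_internal_cp(COND_AL, read=True, crn=0, rd=0)))   # read ID
```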
Register    Register reads     Register writes
0           ID register        reserved
1           reserved           control
2           reserved           translation table base
3           reserved           domain access control
4           reserved           reserved
5           fault status       flush TLB
6           fault address      purge TLB
7           reserved           flush IDC
8-15        reserved           reserved

Table 5-1: cache and MMU control registers

5.3.1 Register 0 ID

Register 0 is a read-only identity register that returns the ARM Ltd code for this chip: 0x4156061x.

    bits 31-24   0x41
    bits 23-16   0x56
    bits 15-4    0x061
    bits 3-0     revision

5.3.2 Register 1 control

Register 1 is write only and contains control bits. All bits in this register are forced low by reset.

    bits 31-9    0 (reserved)
    bit  8       S
    bit  7       B
    bit  6       L
    bit  5       D
    bit  4       P
    bit  3       W
    bit  2       C
    bit  1       A
    bit  0       M

M   bit 0   MMU enable/disable
            0 - on-chip memory management unit turned off
            1 - on-chip memory management unit turned on
A   bit 1   address fault enable/disable
            0 - alignment fault disabled
            1 - alignment fault enabled
C   bit 2   cache enable/disable
            0 - instruction / data cache turned off
            1 - instruction / data cache turned on
W   bit 3   write buffer enable/disable
            0 - write buffer turned off
            1 - write buffer turned on
P   bit 4   ARM 32/26-bit program space
            0 - 26-bit program space selected
            1 - 32-bit program space selected
D   bit 5   ARM 32/26-bit data space
            0 - 26-bit data space selected
            1 - 32-bit data space selected
L   bit 6   late abort timing
            0 - early abort mode selected
            1 - late abort mode selected
B   bit 7   big/little endian
            0 - little-endian operation
            1 - big-endian operation
S   bit 8   system
            This bit controls the ARM610 permission system.

5.3.3 Register 2 translation table base

Register 2 is a write-only register which holds the base of the currently active level one page table.

    bits 31-14   translation table base
    bits 13-0    0 (reserved)

5.3.4 Register 3 domain access control

Register 3 is a write-only register which holds the current access control for domains 0 to 15. See 9.14 domain access control on page 9-14 for the access permission definitions and other details.

    bits 31-30   domain 15
    ...          (two bits per domain, in descending order)
    bits 1-0     domain 0
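The control register bit assignments in 5.3.2 lend themselves to a small helper that builds the value to be written to register 1. This is a sketch with hypothetical names; it only combines the documented bit positions and leaves bits 31-9 at 0 as required for reserved bits.

```python
# Bit positions of the register 1 control bits, as listed in 5.3.2.
CONTROL_BITS = {
    "M": 0,  # MMU
    "A": 1,  # alignment fault
    "C": 2,  # instruction / data cache
    "W": 3,  # write buffer
    "P": 4,  # 32-bit program space
    "D": 5,  # 32-bit data space
    "L": 6,  # late abort timing
    "B": 7,  # big-endian operation
    "S": 8,  # system (permission) bit
}


def control_value(*enabled):
    """Return the register 1 value with the named bits set, all others 0."""
    value = 0
    for name in enabled:
        value |= 1 << CONTROL_BITS[name]
    return value


if __name__ == "__main__":
    # MMU, cache and write buffer on, 32-bit program and data space.
    print(hex(control_value("M", "C", "W", "P", "D")))
```

Setting M and C in the same value corresponds to enabling the MMU and the cache with a single write to the control register.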
5.3.5 Register 4 Reserved

Register 4 is reserved. Accessing this register has no effect, but should never be attempted.

5.3.6 Register 5

Read: Fault status

Reading register 5 returns the status of the last data fault. It is not updated for a prefetch fault. See chapter 9, Memory management unit for more details. Note that only the bottom 12 bits are returned. The upper 20 bits will be the last value on the internal data bus, and therefore will have no meaning. Bits 11:8 are always returned as zero.

  Bits 31:12    Bits 11:8   Bits 7:4   Bits 3:0
  (undefined)   0 0 0 0     Domain     Status

Write: Translation lookaside buffer flush

Writing register 5 flushes the TLB. (The data written is discarded.)

5.3.7 Register 6

Read: Fault address

Reading register 6 returns the virtual address of the last data fault. The fault address occupies bits 31:0.

Write: TLB purge

Writing register 6 purges the TLB; the data is treated as an address (the purge address, bits 31:0) and the TLB is searched for a corresponding page table descriptor. If a match is found, the corresponding entry is marked as invalid. This allows the page table descriptors in main memory to be updated and invalid entries in the on-chip TLB to be purged without requiring the entire TLB to be flushed.
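Since the upper 20 bits of a register 5 read carry no meaning, software has to mask them off before interpreting the value. The following sketch shows one way to do that, assuming the raw 32-bit value has already been read into an integer; the function name decode_fault_status is hypothetical.

```python
def decode_fault_status(raw):
    """Split a raw register 5 read into (status, domain).

    Only bits 11:0 are defined; bits 11:8 always read as zero, so the
    meaningful fields are status (bits 3:0) and domain (bits 7:4).
    """
    value = raw & 0xFFF           # discard the meaningless upper 20 bits
    status = value & 0xF          # bits 3:0
    domain = (value >> 4) & 0xF   # bits 7:4
    return status, domain


# The upper bits here stand in for stale data left on the internal bus.
status, domain = decode_fault_status(0xDEADB03D)
print(status, domain)  # 13 3
```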
5.3.8 Register 7 IDC flush

Register 7 is a write-only register. The data written to this register is discarded and the IDC is flushed.

5.3.9 Registers 8-15 Reserved

Accessing any of these registers will cause the undefined instruction trap to be taken.
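The read and write functions of registers 0 to 15 described in this chapter can be summarised as a lookup. The sketch below simply restates Table 5-1 and 5.3.9 in code; the names READS, WRITES and register_function are hypothetical, and the returned strings are descriptive labels rather than anything defined by the data sheet.

```python
# Functions of the control registers, restated from Table 5-1.
READS = {0: "ID register", 5: "fault status", 6: "fault address"}
WRITES = {1: "control", 2: "translation table base",
          3: "domain access control", 5: "flush TLB",
          6: "purge TLB", 7: "flush IDC"}


def register_function(number, is_write):
    """Return a label for what an access to the given register does."""
    if not 0 <= number <= 15:
        raise ValueError("register number must be in the range 0 to 15")
    if number >= 8:
        # Section 5.3.9: registers 8-15 take the undefined instruction trap.
        return "undefined instruction trap"
    table = WRITES if is_write else READS
    return table.get(number, "reserved")


print(register_function(5, is_write=False))  # fault status
print(register_function(5, is_write=True))   # flush TLB
```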
Chapter 6 Instruction and data cache (IDC)

This chapter describes the instruction and data cache of the ARM610.

  6.1 Introduction
  6.2 Cacheable bit - C
  6.3 Updateable bit - U
  6.4 IDC operation
  6.5 IDC validity
  6.6 Read-lock-write
  6.7 IDC enable/disable and reset
6.1 Introduction

ARM610 contains a 4Kbyte mixed instruction and data cache. The IDC has 256 lines of 16 bytes (four words), organised as a 64-way set-associative cache, and uses the virtual addresses generated by the processor core. The IDC is always reloaded a line at a time (four words). It may be enabled or disabled via the ARM610 control register and is disabled on nRESET. The operation of the cache is further controlled by two bits, cacheable and updateable, which are stored in the memory management page tables (see chapter 9, Memory management unit). For this reason, the MMU must be enabled in order to use the IDC. The two functions may, however, be enabled simultaneously with a single write to the control register.

6.2 Cacheable bit - C

The cacheable bit determines whether data being read may be placed in the IDC and used for subsequent read operations. Typically main memory will be marked as cacheable to improve system performance, and I/O space as non-cacheable to stop the data being stored in ARM610's cache. For example, if the processor is polling a hardware flag in I/O space, it is important that the processor is forced to read data from the external peripheral, and not a copy of initial data held in the cache. The cacheable bit can be configured for both pages and sections.

6.3 Updateable bit - U

The updateable bit determines whether the data in the cache should be updated during a write operation to maintain consistency with the external memory. In certain cases automatic updating of cached data is not required: for instance, when using the MEMC1a memory manager, a read operation in the address space between 3400000H and 3FFFFFFH would access the ROMs, but a write operation in the same address space would change a MEMC register, and should not affect the cached ROM data.
The updateable bit can only be configured by the level one descriptor: that is, an entire section or all the pages for a single level one descriptor share the same configuration.

6.4 IDC operation

When the processor performs a read or write operation, the translation entry for that address is inspected and the state of the cacheable and updateable bits determines the subsequent action.

6.4.1 Cacheable reads C = 1

The cache is searched for the relevant data; if found in the cache, the data is fed to the processor using a fast clock cycle (from FCLK). If the data is not found in the cache, an external memory access is initiated to read the appropriate line of data (four words) from external memory and it is stored in a pseudo-randomly chosen entry in the cache (a linefetch operation).
6.4.2 Uncacheable reads C = 0

The cache is not searched for the relevant data; instead an external memory access is initiated. No linefetch operation is performed, and the cache is not updated.

6.4.3 Updateable writes U = 1

An external memory access is initiated, and the cache is searched; if the cache holds a copy of the data from the address being written to, then the cache data is simultaneously updated.

6.4.4 Non-updateable writes U = 0

An external memory access is initiated, but the cache is not searched and the contents of the cache are not affected.

6.5 IDC validity

The IDC operates with virtual addresses, so care must be taken to ensure that its contents remain consistent with the virtual to physical mappings performed by the memory management unit. If the memory mappings are changed, the IDC validity must be ensured.

6.5.1 Software IDC flush

The entire IDC may be marked as invalid by writing to the ARM610 IDC flush register (register 7). The cache is flushed as soon as the register is written, but note that the following two instruction fetches may come from the cache before the register is written.

6.5.2 Doubly mapped space

Since the cache works with virtual addresses, it is assumed that every virtual address maps to a different physical address. If the same physical location is accessed by more than one virtual address, the cache cannot maintain consistency, since each virtual address will have a separate entry in the cache, and only one entry will be updated on a processor write operation. To avoid any cache inconsistencies, both doubly-mapped virtual addresses should be marked as uncacheable.

6.6 Read-lock-write

The IDC treats the read-lock-write instruction as a special case. The read phase always forces a read of external memory, regardless of whether the data is contained in the cache.
The write phase is treated as a normal write operation (and if marked as updateable, and the data is already in the cache, the cache will be updated). Externally the two phases are flagged as indivisible by asserting the LOCK signal.
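The read and write rules of 6.4 can be illustrated with a small behavioural model. This is a sketch under simplifying assumptions, not a description of the hardware: it ignores associativity, timing, replacement (the model never evicts a line) and the read-lock-write case, it expects word-aligned addresses, and the names IdcModel, memory and lines are hypothetical.

```python
class IdcModel:
    """Behavioural sketch of the cacheable/updateable rules in 6.4."""

    LINE_BYTES = 16  # four words per line

    def __init__(self, memory):
        self.memory = memory  # word address -> value (stands in for external memory)
        self.lines = {}       # line base address -> list of four cached words

    def read(self, addr, cacheable):
        if not cacheable:
            # 6.4.2: cache not searched, no linefetch, cache not updated.
            return self.memory.get(addr, 0)
        base = addr & ~(self.LINE_BYTES - 1)
        if base not in self.lines:
            # 6.4.1: miss, so fetch the whole four-word line.
            self.lines[base] = [self.memory.get(base + 4 * i, 0) for i in range(4)]
        return self.lines[base][(addr - base) // 4]

    def write(self, addr, value, updateable):
        self.memory[addr] = value  # an external write is always initiated
        base = addr & ~(self.LINE_BYTES - 1)
        if updateable and base in self.lines:
            # 6.4.3: a cached copy exists, so update it as well.
            self.lines[base][(addr - base) // 4] = value


idc = IdcModel({0x100: 1})
idc.read(0x100, cacheable=True)           # linefetch brings the line in
idc.write(0x100, 2, updateable=False)     # 6.4.4: cached copy left stale
print(idc.read(0x100, cacheable=True))    # 1
print(idc.read(0x100, cacheable=False))   # 2
```

The stale read in the usage lines is the behaviour that the MEMC1a example in 6.3 relies on: a non-updateable write changes what lies behind the address without disturbing the cached read data.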
6.7 IDC enable/disable and reset

The IDC is automatically disabled and flushed on nRESET. Once enabled, cacheable read accesses will cause lines to be placed in the cache. If subsequently disabled, no new lines will be placed in the cache, and the cache is not searched, but updateable write operations will continue to operate, thus maintaining consistency with the external memory. If the cache is subsequently re-enabled, it must be flushed if data already in the cache no longer matches that in external memory.

6.7.1 To enable the IDC

To enable the IDC, make sure that the MMU is enabled first by setting bit 0 in the control register, then enable the IDC by setting bit 2 in the control register. The MMU and IDC may be enabled simultaneously with a single control register write.

6.7.2 To disable the IDC

To disable the IDC, clear bit 2 in the control register.

Note: Updateable writes continue but no linefetches are performed. To fully inhibit the cache's operation it should be disabled and then flushed to ensure it contains no valid entries.
Chapter 7 Write buffer (WB)

This chapter describes the write buffer of the ARM610.

  7.1 Introduction
  7.2 Bufferable bit
  7.3 Write buffer operation
7.1 Introduction

The ARM610 write buffer is provided to improve system performance. It can buffer up to eight words of data, and two independent addresses. It may be enabled or disabled via the W bit (bit 3) in the ARM610 control register, and the buffer is disabled and flushed on reset. The operation of the write buffer is further controlled by one bit, B, or bufferable, which is stored in the memory management page tables. For this reason, the MMU must be enabled in order to use the write buffer. The two functions may, however, be enabled simultaneously with a single write to the control register. For a write to use the write buffer, both the W bit in the control register and the B bit in the corresponding page table must be set.

7.2 Bufferable bit

This bit controls whether a write operation may or may not use the write buffer. Typically main memory will be bufferable and I/O space unbufferable. The bufferable bit can be configured for both pages and sections.

7.3 Write buffer operation

When the CPU performs a write operation, the translation entry for that address is inspected and the state of the B bit determines the subsequent action. If the write buffer is disabled via the ARM610 control register, bufferable writes are treated in the same way as unbuffered writes.

7.3.1 Bufferable write

If the write buffer is enabled and the processor performs a write to a bufferable area, the data is placed in the write buffer at FCLK speeds and the CPU continues execution. The write buffer then performs the external write in parallel. If, however, the write buffer is full (either because there are already eight words of data in the buffer, or because there is no slot for the new address) then the processor is stalled until there is sufficient space in the buffer.
7.3.2 Unbufferable writes

If the write buffer is disabled or the CPU performs a write to an unbufferable area, the processor is stalled until the write buffer empties and the write completes externally, which may require synchronisation and several external clock cycles.

7.3.3 Read-lock-write

The write phase of a read-lock-write sequence is treated as an unbuffered write, even if it is marked as buffered.

Note: A single write requires one address slot and one data slot in the write buffer; a sequential write of n words requires one address slot and n data slots. The total of 8 data slots in the buffer may be used as required. So, for instance, there could be one non-sequential write and one sequential write of seven words in the buffer, and the processor could continue as normal: a third write or an eighth word in the second write would stall the processor until the first write had completed.
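The slot accounting in the note above can be checked with a few lines of arithmetic. The sketch below only models capacity (two address slots, eight data slots) and says nothing about how or when the buffer drains; the name fits and the representation of the buffer as a list of word counts are assumptions made for illustration.

```python
DATA_SLOTS = 8     # words of data the buffer can hold
ADDRESS_SLOTS = 2  # independent addresses the buffer can hold


def fits(pending, new_words):
    """Return True if a new write of new_words words can be buffered.

    pending holds one word count per write already in the buffer; each
    buffered write occupies one address slot and that many data slots.
    """
    has_address_slot = len(pending) < ADDRESS_SLOTS
    has_data_slots = sum(pending) + new_words <= DATA_SLOTS
    return has_address_slot and has_data_slots


print(fits([1], 7))     # True: one single write plus a seven-word sequential write
print(fits([1, 7], 1))  # False: a third write finds no address slot
print(fits([1], 8))     # False: an eighth word in the second write exceeds eight data slots
```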
7.3.4 To enable the write buffer

To enable the write buffer, ensure the MMU is enabled by setting bit 0 in the control register, then enable the write buffer by setting bit 3 in the control register. The MMU and write buffer may be enabled simultaneously with a single write to the control register.

7.3.5 To disable the write buffer

To disable the write buffer, clear bit 3 in the control register.

Note: Any writes already in the write buffer will complete normally.
Chapter 8 Coprocessors

This chapter introduces the use of coprocessors with the ARM610.

  8.1 Overview
8.1 Overview

ARM610 has no external coprocessor bus, so it is not possible to add external coprocessors to this device. ARM610 does have an internal coprocessor, designated #15, for internal control of the device. If a coprocessor other than #15 is accessed, the CPU will take the undefined instruction trap.
Chapter 9 Memory management unit

This chapter describes the ARM610 memory management unit (MMU).

  9.1 Memory management unit (MMU)
  9.2 MMU program accessible registers
  9.3 Address translation
  9.4 Translation process
  9.5 Level one descriptor
  9.6 Page table descriptor
  9.7 Section descriptor
  9.8 Translating section references
  9.9 Level two descriptor
  9.10 Translating small page references
  9.11 Translating large page references
  9.12 MMU faults and CPU aborts
  9.13 Fault address and fault status registers (FAR and FSR)
  9.14 Domain access control
  9.15 Fault checking sequence
  9.16 External aborts
  9.17 Interaction of the MMU, IDC and write buffer
  9.18 Effect of reset
9.1 Memory management unit (MMU)

The MMU performs two primary functions: it translates virtual addresses into physical addresses, and it controls memory access permissions. The MMU hardware required to perform these functions consists of a translation look-aside buffer (TLB), access control logic, and translation table walking logic.

The MMU supports memory accesses based on sections or pages. Sections are 1MB blocks of memory. Two different page sizes are supported: small pages are 4KB blocks of memory and large pages are 64KB blocks of memory. (Large pages are supported to allow mapping of a large region of memory while using only a single entry in the TLB.) Additional access control mechanisms are extended within small pages to 1KB sub-pages and within large pages to 16KB sub-pages.

The MMU also supports the concept of domains: areas of memory that can be defined to possess individual access rights. The domain access control register is used to specify access rights for up to 16 separate domains.

The TLB caches 32 translated entries. During most memory accesses, the TLB provides the translation information to the access control logic. If the TLB contains a translated entry for the virtual address, the access control logic determines whether access is permitted. If access is permitted, the MMU outputs the appropriate physical address corresponding to the virtual address. If access is not permitted, the MMU signals the CPU to abort.

If the TLB misses (i.e. does not contain a translated entry for the virtual address), the translation table walk hardware is invoked to retrieve the translation information from a translation table in physical memory. Once retrieved, the translation information is placed into the TLB, possibly overwriting an existing value. The entry to be overwritten is chosen by cycling sequentially through the TLB locations.
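The replacement policy described above, cycling sequentially through the 32 TLB locations, amounts to a round-robin pointer. The sketch below models only that policy. The class name TlbModel, the use of an opaque tag to identify an entry, and the choice to start cycling at location 0 are all assumptions; the data sheet does not specify them.

```python
class TlbModel:
    """Round-robin replacement over a fixed number of TLB locations."""

    ENTRIES = 32

    def __init__(self):
        self.slots = [None] * self.ENTRIES  # each slot: (tag, translation) or None
        self.next_victim = 0                # assumed starting point of the cycle

    def lookup(self, tag):
        """Return the cached translation for tag, or None on a TLB miss."""
        for entry in self.slots:
            if entry is not None and entry[0] == tag:
                return entry[1]
        return None

    def insert(self, tag, translation):
        """Place a newly walked translation, overwriting the next location in turn."""
        self.slots[self.next_victim] = (tag, translation)
        self.next_victim = (self.next_victim + 1) % self.ENTRIES


tlb = TlbModel()
for tag in range(33):            # one more insertion than there are locations
    tlb.insert(tag, tag * 0x1000)
print(tlb.lookup(0))             # None: entry 0 was overwritten by entry 32
print(hex(tlb.lookup(1)))        # 0x1000
```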
When the MMU is turned off (as happens on reset), the virtual address is output directly onto the physical address bus.

9.2 MMU program accessible registers

The ARM610 processor provides several 32-bit registers which determine the operation of the MMU. The format for these registers is shown in Figure 9-1: MMU register summary. A brief description of the registers is provided below. Each register will be discussed in more detail within the section that describes its use. Data is written to and read from the MMU's registers using the ARM CPU's MRC and MCR coprocessor instructions.
  Register   Access   Contents
  1          Write    Control: bits 31:9 = 0, bits 8:0 = S B L D P W C A M
  2          Write    Translation table base: bits 31:14
  3          Write    Domain access control: sixteen 2-bit fields, domain 15 (bits 31:30) down to domain 0 (bits 1:0)
  5          Read     Fault status: bits 11:8 = 0, bits 7:4 = domain, bits 3:0 = status
  5          Write    Flush TLB
  6          Read     Fault address: bits 31:0
  6          Write    Purge address: bits 31:0
  Figure 9-1: MMU register summary

Note: The registers not shown are reserved and should not be used.

Translation table base register

This holds the physical address of the base of the translation table maintained in main memory. Note that this base must reside on a 16KB boundary.

Domain access control register

This consists of sixteen 2-bit fields, each of which defines the access permissions for one of the sixteen domains (D15-D0).

Fault status register

This indicates the domain and type of access being attempted when an abort occurred. Bits 7:4 specify which of the sixteen domains (D15-D0) was being accessed when a fault occurred. Bits 3:1 indicate the type of access being attempted. The encoding of these bits is different for internal and external faults (as indicated by bit 0 in the register) and is shown in Table 9-4: Priority encoding of fault status. A write to this register flushes the TLB.

Fault address register

This holds the virtual address of the access which was attempted when a fault occurred. A write to this register causes the data written to be treated as an address and, if it is found in the TLB, the entry is marked as invalid. (This operation is known as a TLB purge.) The fault status register and fault address register are only updated for data faults, not for prefetch faults.

9.3 Address translation

The MMU translates virtual addresses generated by the CPU into physical addresses to access external memory, and also derives and checks the access permission. Translation information, which consists of both the address translation data and the access permission data, resides in a translation table located in physical memory. The MMU provides the logic needed to traverse this translation table, obtain the translated address, and check the access permission.

There are three routes by which the address translation (and hence the permission check) takes place. The route taken depends on whether the address in question has been marked as a section-mapped access or a page-mapped access; and there are two sizes of page-mapped access (large pages and small pages). However, the translation process always starts out in the same way, as described below, with a level one fetch. A section-mapped access only requires a level one fetch, but a page-mapped access also requires a level two fetch.

9.4 Translation process

9.4.1 Translation table base

The translation process is initiated when the on-chip TLB does not contain an entry for the requested virtual address. The translation table base (TTB) register points to the base of a table in physical memory which contains section and/or page descriptors. The 14 low-order bits of the TTB register are set to zero as illustrated in Figure 9-2: Translation table base register; the table must reside on a 16KB boundary.

  Bits 31:14               Bits 13:0
  Translation table base   0
  Figure 9-2: Translation table base register

9.4.2 Level one fetch

Bits 31:14 of the translation table base register are concatenated with bits 31:20 of the virtual address to produce a 30-bit address as illustrated in Figure 9-3: Accessing the translation table first level descriptors. This address selects a four-byte translation table entry which is a first level descriptor for either a section or a page (bit 1 of the descriptor returned specifies whether it is for a section or page).
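The concatenation described in 9.4.2 can be written out as bit arithmetic: the 18-bit translation base and the 12-bit table index form a 30-bit word address, and the two low-order address bits are zero because each entry is four bytes. The function name below is hypothetical and the sketch computes only the address of the descriptor, not its contents.

```python
def level_one_descriptor_address(ttb, virtual_address):
    """Return the physical address of the level one descriptor.

    ttb is the translation table base register value and
    virtual_address is the address being translated.
    """
    translation_base = ttb & 0xFFFFC000            # TTB bits 31:14
    table_index = (virtual_address >> 20) & 0xFFF  # virtual address bits 31:20
    return translation_base | (table_index << 2)   # four-byte entries


print(hex(level_one_descriptor_address(0x00004000, 0x12345678)))  # 0x448c
```

Since the table index is 12 bits wide, the level one table has 4096 four-byte entries, which is the 16KB that the alignment rule in 9.4.1 reserves.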
  Virtual address                bits 31:20 = table index, bits 19:0 = section index
  Translation table base         bits 31:14 = translation base
  Level one descriptor address   bits 31:14 = translation base, bits 13:2 = table index, bits 1:0 = 00
  Figure 9-3: Accessing the translation table first level descriptors

9.5 Level one descriptor

The level one descriptor returned is either a page table descriptor or a section descriptor, and its format varies accordingly. The following figure illustrates the format of level one descriptors.

  Fault      bits 1:0 = 00
  Page       bits 31:10 = page table base address, bits 8:5 = domain, bit 4 = U, bits 1:0 = 01
  Section    bits 31:20 = section base address, bits 11:10 = AP, bits 8:5 = domain, bit 4 = U, bit 3 = C, bit 2 = B, bits 1:0 = 10
  Reserved   bits 1:0 = 11
  Figure 9-4: Level one descriptors
The two least significant bits indicate the descriptor type and validity, and are interpreted as shown below.

  Value   Meaning    Notes
  0 0     Invalid    Generates a section translation fault
  0 1     Page       Indicates that this is a page descriptor
  1 0     Section    Indicates that this is a section descriptor
  1 1     Reserved   Reserved for future use
  Table 9-1: Interpreting level one descriptor bits [1:0]

9.6 Page table descriptor

Bits 3:2 are always written as 0.

Bit 4 Updateable: indicates that the data in the cache should be updated during a write operation to maintain consistency with external memory (if the cache is enabled).

Bits 8:5 specify one of the sixteen possible domains (held in the domain access control register) that contain the primary access controls.

Bits 31:10 form the base for referencing the page table entry. (The page table index for the entry is derived from the virtual address as illustrated in Figure 9-7: Small page translation.)

If a page table descriptor is returned from the level one fetch, a level two fetch is initiated as described below.
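The fields of a page table descriptor listed in 9.6, together with the type encoding of Table 9-1, can be extracted with shifts and masks. This is an illustrative sketch; the function name and the dictionary keys are hypothetical.

```python
# Type encoding from Table 9-1.
LEVEL_ONE_TYPES = {0b00: "invalid", 0b01: "page",
                   0b10: "section", 0b11: "reserved"}


def decode_page_table_descriptor(descriptor):
    """Decode a level one descriptor, expanding the fields of the page type."""
    kind = LEVEL_ONE_TYPES[descriptor & 0b11]
    if kind != "page":
        return {"type": kind}
    return {
        "type": kind,
        "updateable": bool(descriptor & (1 << 4)),   # bit 4 (U)
        "domain": (descriptor >> 5) & 0xF,           # bits 8:5
        "page_table_base": descriptor & 0xFFFFFC00,  # bits 31:10
    }


fields = decode_page_table_descriptor(0x00012471)
print(fields["domain"], hex(fields["page_table_base"]))  # 3 0x12400
```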
9.7 Section descriptor

Bits 4:2 (U, C and B) control the cache- and write-buffer-related functions as follows:

  U - Updateable: indicates that the data in the cache should be updated during a write operation to maintain consistency with external memory (if the cache is enabled).
  C - Cacheable: indicates that data at this address will be placed in the cache (if the cache is enabled).
  B - Bufferable: indicates that data at this address will be written through the write buffer (if the write buffer is enabled).

Bits 8:5 specify one of the sixteen possible domains (held in the domain access control register) that contain the primary access controls.

Bits 11:10 (AP) specify the access permissions for this section and are interpreted as shown in Table 9-2: Interpreting access permission (AP) bits. Their interpretation is dependent upon the setting of the S bit (control register bit 8). Note that the domain access control specifies the primary access control; the AP bits only have an effect in client mode. Refer to 9.15.4 Permission fault.

Bits 19:12 are always written as 0.

Bits 31:20 form the corresponding bits of the physical address for the 1MB section.

  AP   S   Supervisor permissions   User permissions   Notes
  00   0   No access                No access          Any access generates a permission fault
  00   1   Read only                No access          Supervisor read only permitted
  01   x   Read/write               No access          Access allowed only in supervisor mode
  10   x   Read/write               Read only          Writes in user mode cause permission fault
  11   x   Read/write               Read/write         All access types permitted in both modes
  Table 9-2: Interpreting access permission (AP) bits
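Table 9-2 can be restated as a small decision function. The sketch below covers only the AP check, which applies when the domain is in client mode; domain checking is a separate step. The function name and parameter names are hypothetical.

```python
def access_allowed(ap, s_bit, supervisor, write):
    """Apply Table 9-2 to one access.

    ap is the 2-bit access permission field, s_bit is control register
    bit 8, supervisor is True for supervisor-mode accesses and write is
    True for writes.
    """
    if ap == 0b00:
        # S = 0: no access at all. S = 1: supervisor may read only.
        return bool(s_bit) and supervisor and not write
    if ap == 0b01:
        return supervisor              # user mode has no access
    if ap == 0b10:
        return supervisor or not write  # user mode may read only
    if ap == 0b11:
        return True                    # all accesses permitted
    raise ValueError("ap must be a 2-bit value")


print(access_allowed(0b10, 0, supervisor=False, write=True))   # False
print(access_allowed(0b10, 0, supervisor=False, write=False))  # True
```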
9.8 Translating section references

Figure 9-5: Section translation illustrates the complete section translation sequence. Note that the access permissions contained in the level one descriptor must be checked before the physical address is generated. The sequence for checking access permissions is described below.

  Virtual address                bits 31:20 = table index, bits 19:0 = section index
  Translation table base         bits 31:14 = translation base
  Level one descriptor address   bits 31:14 = translation base, bits 13:2 = table index, bits 1:0 = 00
  First level descriptor         bits 31:20 = section base address, bits 11:10 = AP, bits 8:5 = domain, bit 4 = U, bit 3 = C, bit 2 = B, bits 1:0 = 10
  Physical address               bits 31:20 = section base address, bits 19:0 = section index
  Figure 9-5: Section translation
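The final step of Figure 9-5, combining the section base address with the section index, is shown below as a minimal sketch. It assumes the descriptor has already been fetched and that the permission checks have passed; the function name is hypothetical.

```python
def section_physical_address(descriptor, virtual_address):
    """Form the physical address for a section-mapped access."""
    if descriptor & 0b11 != 0b10:
        raise ValueError("not a section descriptor")
    section_base = descriptor & 0xFFF00000        # descriptor bits 31:20
    section_index = virtual_address & 0x000FFFFF  # virtual address bits 19:0
    return section_base | section_index


print(hex(section_physical_address(0xABC00C1E, 0x12345678)))  # 0xabc45678
```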
9.9 Level two descriptor

If the level one fetch returns a page table descriptor, this provides the base address of the page table to be used. The page table is then accessed as described in Figure 9-7: Small page translation, and a page table entry, or level two descriptor, is returned. This in turn may define either a small page or a large page access. The figure below shows the format of level two descriptors.

  Fault        bits 1:0 = 00
  Large page   bits 31:16 = large page base address, bits 11:10 = AP3, bits 9:8 = AP2, bits 7:6 = AP1, bits 5:4 = AP0, bit 3 = C, bit 2 = B, bits 1:0 = 01
  Small page   bits 31:12 = small page base address, bits 11:10 = AP3, bits 9:8 = AP2, bits 7:6 = AP1, bits 5:4 = AP0, bit 3 = C, bit 2 = B, bits 1:0 = 10
  Reserved     bits 1:0 = 11
  Figure 9-6: Page table entry (level two descriptor)

The two least significant bits indicate the page size and validity, and are interpreted as follows.

  Value   Meaning      Notes
  0 0     Invalid      Generates a page translation fault
  0 1     Large page   Indicates that this is a 64KB page
  1 0     Small page   Indicates that this is a 4KB page
  1 1     Reserved     Reserved for future use
  Table 9-3: Interpreting page table entry bits 1:0

Bit 2 B - Bufferable: indicates that data at this address will be written through the write buffer (if the write buffer is enabled).

Bit 3 C - Cacheable: indicates that data at this address will be placed in the IDC (if the cache is enabled).

Bits 11:4 specify the access permissions (AP3 - AP0) for the four sub-pages. The interpretation of these bits is described earlier in Table 9-2: Interpreting access permission (AP) bits.

For large pages, bits 15:12 are programmed as 0.

Bits 31:12 (small pages) or bits 31:16 (large pages) are used to form the corresponding bits of the physical address, the physical page number. (The page index is derived from the virtual address as illustrated in Figure 9-7: Small page translation and Figure 9-8: Large page translation.)
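The level two descriptor fields can be unpacked in the same way as the level one fields. The sketch below follows the layout in Figure 9-6 and Table 9-3, with AP0 taken to be the lowest of the four 2-bit fields (bits 5:4) and AP3 the highest (bits 11:10), as read from the figure. The function name and dictionary keys are hypothetical.

```python
# Type encoding from Table 9-3.
LEVEL_TWO_TYPES = {0b00: "invalid", 0b01: "large page",
                   0b10: "small page", 0b11: "reserved"}


def decode_level_two(descriptor):
    """Decode a page table entry (level two descriptor)."""
    kind = LEVEL_TWO_TYPES[descriptor & 0b11]
    if kind in ("invalid", "reserved"):
        return {"type": kind}
    base_mask = 0xFFFF0000 if kind == "large page" else 0xFFFFF000
    return {
        "type": kind,
        "bufferable": bool(descriptor & (1 << 2)),  # bit 2 (B)
        "cacheable": bool(descriptor & (1 << 3)),   # bit 3 (C)
        # ap[n] is the APn field: AP0 in bits 5:4 up to AP3 in bits 11:10.
        "ap": [(descriptor >> (4 + 2 * n)) & 0b11 for n in range(4)],
        "page_base": descriptor & base_mask,
    }


entry = decode_level_two(0x00345E4A)
print(entry["type"], entry["ap"], hex(entry["page_base"]))
# small page [0, 1, 2, 3] 0x345000
```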
9.10 Translating small page references

Figure 9-7: Small page translation illustrates the complete translation sequence for a 4KB small page. Page translation involves one additional step beyond that of a section translation: the level one descriptor is the page table descriptor, and this is used to point to the level two descriptor, or page table entry. (Note that the access permissions are now contained in the level two descriptor and must be checked before the physical address is generated. The sequence for checking access permissions is described later.)

  Virtual address                bits 31:20 = table index, bits 19:12 = L2 table index, bits 11:0 = page index
  Translation table base         bits 31:14 = translation base
  Level one descriptor address   bits 31:14 = translation base, bits 13:2 = table index, bits 1:0 = 00
  First level descriptor         bits 31:10 = page table base address, bits 8:5 = domain, bit 4 = U, bits 1:0 = 01
  Level two descriptor address   bits 31:10 = page table base address, bits 9:2 = L2 table index, bits 1:0 = 00
  Second level descriptor        bits 31:12 = page base address, bits 11:4 = AP3 to AP0, bit 3 = C, bit 2 = B, bits 1:0 = 10
  Physical address               bits 31:12 = page base address, bits 11:0 = page index
  Figure 9-7: Small page translation
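The two address calculations in Figure 9-7 can be expressed as follows. This is a sketch of the arithmetic only: it assumes both descriptors are valid and of the expected type, and it does not model the memory fetches or permission checks. The function names are hypothetical.

```python
def level_two_descriptor_address(level_one_descriptor, virtual_address):
    """Return the address of the page table entry for a page-mapped access."""
    page_table_base = level_one_descriptor & 0xFFFFFC00  # descriptor bits 31:10
    l2_table_index = (virtual_address >> 12) & 0xFF      # virtual address bits 19:12
    return page_table_base | (l2_table_index << 2)       # four-byte entries


def small_page_physical_address(level_two_descriptor, virtual_address):
    """Return the physical address for a 4KB small page access."""
    page_base = level_two_descriptor & 0xFFFFF000  # descriptor bits 31:12
    page_index = virtual_address & 0x00000FFF      # virtual address bits 11:0
    return page_base | page_index


va = 0x12345678
print(hex(level_two_descriptor_address(0x00012471, va)))  # 0x12514
print(hex(small_page_physical_address(0x00345E4A, va)))   # 0x345678
```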
9.11 Translating large page references

Figure 9-8: Large page translation illustrates the complete translation sequence for a 64KB large page. Note that since the upper four bits of the page index and low-order four bits of the page table index overlap, each page table entry for a large page must be duplicated 16 times (in consecutive memory locations) in the page table.

  Virtual address                bits 31:20 = table index, bits 19:12 = L2 table index, bits 15:0 = page index
  Translation table base         bits 31:14 = translation base
  Level one descriptor address   bits 31:14 = translation base, bits 13:2 = table index, bits 1:0 = 00
  First level descriptor         bits 31:10 = page table base address, bits 8:5 = domain, bit 4 = U, bits 1:0 = 01
  Level two descriptor address   bits 31:10 = page table base address, bits 9:2 = L2 table index, bits 1:0 = 00
  Second level descriptor        bits 31:16 = page base address, bits 11:4 = AP3 to AP0, bit 3 = C, bit 2 = B, bits 1:0 = 01
  Physical address               bits 31:16 = page base address, bits 15:0 = page index
  Figure 9-8: Large page translation
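The overlap between the page index (bits 15:0) and the L2 table index (bits 19:12) is the reason for the sixteen-fold duplication noted above: addresses within one large page differ in bits 15:12, so they select sixteen consecutive page table entries, all of which must hold the same descriptor. The sketch below computes the physical address and the set of table indices concerned; both function names are hypothetical.

```python
def large_page_physical_address(level_two_descriptor, virtual_address):
    """Return the physical address for a 64KB large page access."""
    page_base = level_two_descriptor & 0xFFFF0000  # descriptor bits 31:16
    page_index = virtual_address & 0x0000FFFF      # virtual address bits 15:0
    return page_base | page_index


def large_page_entry_indices(virtual_address):
    """Return the sixteen L2 table indices that must hold the same entry."""
    l2_table_index = (virtual_address >> 12) & 0xFF  # virtual address bits 19:12
    first = l2_table_index & ~0xF                    # clear the overlapping low four bits
    return list(range(first, first + 16))


va = 0x12345678
print(hex(large_page_physical_address(0x00AB0E49, va)))  # 0xab5678
print(large_page_entry_indices(va)[0], large_page_entry_indices(va)[-1])  # 64 79
```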
9.12 MMU faults and CPU aborts

The MMU generates four types of faults:

  alignment
  translation
  domain
  permission

In addition, an external abort may be raised on external data access.

The access control mechanisms of the MMU detect the conditions that produce these faults. If a fault is detected as the result of a memory access, the MMU will abort the access and signal the fault condition to the CPU. The MMU is also capable of retaining status and address information about the abort. The CPU recognises two types of abort: data aborts and prefetch aborts, and these are treated differently by the MMU.

If the MMU detects an access violation, it will do so before the external memory access takes place, and it will therefore inhibit the access. External aborts will not necessarily inhibit the external access, as described in the section on external aborts.

9.13 Fault address and fault status registers (FAR and FSR)

Aborts resulting from data accesses (data aborts) are acted upon by the CPU immediately, and the MMU places an encoded 4-bit value FS[3:0], along with the 4-bit encoded domain number, in the fault status register (FSR). In addition, the virtual processor address which caused the data abort is latched into the fault address register (FAR). If an access violation simultaneously generates more than one source of abort, they are encoded in the priority given in Table 9-4: Priority encoding of fault status.

CPU instructions, on the other hand, are prefetched, so a prefetch abort simply flags the instruction as it enters the instruction pipeline. Only when (and if) the instruction is executed does it cause an abort; an abort is not acted upon if the instruction is not used (i.e. it is branched around). Because instruction prefetch aborts may or may not be acted upon, the MMU status information is not preserved for the resulting CPU abort; for a prefetch abort, the MMU does not update the FSR or FAR.
The sections that follow describe the various access permissions and controls supported by the MMU and detail how these are interpreted to generate faults.

            Source                               FS[3210]   Domain[3:0]   FAR
  Highest   Write buffer                         00x0       x             Note 3
            Bus error (linefetch)     Section    0100       valid         Note 4
                                      Page       0110       valid         valid
            Bus error (other)         Section    1000       valid         valid
                                      Page       1010       valid         valid
            Alignment                            00x1       x             valid
            Bus error (translation)   Level 1    1100       x             valid
                                      Level 2    1110       valid         valid
            Translation               Section    0101       Note 2        valid
                                      Page       0111       valid         valid
            Domain                    Section    1001       valid         valid
                                      Page       1011       valid         valid
            Permission                Section    1101       valid         valid
  Lowest                              Page       1111       valid         valid
  Table 9-4: Priority encoding of fault status

x is undefined: may read as 0 or 1.

Notes:
  1 Any abort masked by the priority encoding may be regenerated by fixing the primary abort and restarting the instruction.
  2 In fact this register will contain bits [8:5] of the level 1 entry which are undefined, but would encode the domain in a valid entry.
  3 The write buffer bus error is asynchronous and not restartable. The fault address register reflects the first data operation that could be aborted. The areas of memory which generate external aborts should not be marked as bufferable.
  4 The entry will be valid if the error was flagged on word 0 of the linefetch. Otherwise the domain and FAR may be invalid and the cache line may contain invalid data.
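The FS[3:0] encodings in Table 9-4 can be matched against a value read from the fault status register. The sketch below treats each x as a don't-care bit and returns a descriptive label; the table contents come from Table 9-4, while the names FAULT_SOURCES and fault_source and the label strings are hypothetical.

```python
# (pattern, source) pairs from Table 9-4, highest priority first.
# An "x" in a pattern is an undefined bit that may read as 0 or 1.
FAULT_SOURCES = [
    ("00x0", "write buffer"),
    ("0100", "bus error (linefetch), section"),
    ("0110", "bus error (linefetch), page"),
    ("1000", "bus error (other), section"),
    ("1010", "bus error (other), page"),
    ("00x1", "alignment"),
    ("1100", "bus error (translation), level 1"),
    ("1110", "bus error (translation), level 2"),
    ("0101", "translation, section"),
    ("0111", "translation, page"),
    ("1001", "domain, section"),
    ("1011", "domain, page"),
    ("1101", "permission, section"),
    ("1111", "permission, page"),
]


def fault_source(fs):
    """Return the Table 9-4 source that matches the 4-bit status value fs."""
    bits = format(fs & 0xF, "04b")
    for pattern, source in FAULT_SOURCES:
        if all(p in ("x", b) for p, b in zip(pattern, bits)):
            return source
    raise ValueError("no matching encoding")


print(fault_source(0b1101))  # permission, section
print(fault_source(0b0011))  # alignment
```

Every 4-bit value matches exactly one row, so the order of the list matters only as a record of priority, not for the lookup itself.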
9.14 Domain access control

MMU accesses are primarily controlled via domains. There are 16 domains, and each has a 2-bit field to define it. Two basic kinds of users are supported: clients and managers. Clients use a domain; managers control the behaviour of the domain. The domains are defined in the domain access control register. Figure 9-9: Domain access control register format illustrates how the 32 bits of the register are allocated to define the sixteen 2-bit domains.

  Bits     31:30  29:28  27:26  25:24  23:22  21:20  19:18  17:16  15:14  13:12  11:10  9:8  7:6  5:4  3:2  1:0
  Domain   15     14     13     12     11     10     9      8      7      6      5      4    3    2    1    0
  Figure 9-9: Domain access control register format

Table 9-5: Interpreting access bits in domain access control register defines how the bits within each domain are interpreted to specify the access permissions.

  Value   Meaning     Notes
  00      No access   Any access will generate a domain fault.
  01      Client      Accesses are checked against the access permission bits in the section or page descriptor.
  10      Reserved    Reserved. Currently behaves like the no access mode.
  11      Manager     Accesses are not checked against the access permission bits so a permission fault cannot be generated.
  Table 9-5: Interpreting access bits in domain access control register
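Selecting a domain's 2-bit field from the domain access control register and interpreting it with Table 9-5 can be sketched as below. The placement of domain n in bits 2n+1:2n is read from Figure 9-9; the function name and the returned labels are hypothetical.

```python
# Field encoding from Table 9-5.
DOMAIN_MODES = {0b00: "no access", 0b01: "client",
                0b10: "reserved", 0b11: "manager"}


def domain_access(dacr, domain):
    """Return the access mode of the given domain (0 to 15)."""
    if not 0 <= domain <= 15:
        raise ValueError("domain must be in the range 0 to 15")
    field = (dacr >> (2 * domain)) & 0b11  # domain n occupies bits 2n+1:2n
    return DOMAIN_MODES[field]


dacr = 0xC0000001  # domain 15 = manager, domain 0 = client, others no access
print(domain_access(dacr, 0))   # client
print(domain_access(dacr, 15))  # manager
print(domain_access(dacr, 1))   # no access
```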
9.15 Fault checking sequence

The sequence by which the MMU checks for access faults is slightly different for sections and pages. The figure below illustrates the sequence for both types of accesses. The sections and figures that follow describe the conditions that generate each of the faults.

  Virtual address
    Check address alignment: misaligned gives alignment fault
    Get level one descriptor: invalid gives section translation fault
    Section:
      Check domain status: no access (00) or reserved (10) gives section domain fault
      Client (01): check access permissions: violation gives section permission fault
      Manager (11): access permissions not checked
    Page:
      Get page table entry: invalid gives page translation fault
      Check domain status: no access (00) or reserved (10) gives page domain fault
      Client (01): check access permissions: violation gives sub-page permission fault
      Manager (11): access permissions not checked
  Physical address
  Figure 9-10: Sequence for checking faults
9.15.1 alignment fault

if alignment fault is enabled (bit 1 in control register set), the mmu will generate an alignment fault on any data word access whose address is not word-aligned, irrespective of whether the mmu is enabled or not; in other words, if either of virtual address bits [1:0] is not 0. an alignment fault will not be generated on any instruction fetch, nor on any byte access. if the access generates an alignment fault, the access sequence will abort without reference to further permission checks.

9.15.2 translation fault

there are two types of translation fault: section and page.

1 a section translation fault is generated if the level one descriptor is marked as invalid. this happens if bits [1:0] of the descriptor are both 0 or both 1.
2 a page translation fault is generated if the page table entry is marked as invalid. this happens if bits [1:0] of the entry are both 0 or both 1.

9.15.3 domain fault

there are two types of domain fault: section and page. in both cases the level one descriptor holds the 4-bit domain field which selects one of the sixteen 2-bit domains in the domain access control register. the two bits of the specified domain are then checked for access permissions as detailed in table 9-2: interpreting access permission (ap) bits on page 9-7. in the case of a section, the domain is checked once the level one descriptor is returned, and in the case of a page, the domain is checked once the page table entry is returned.

if the specified access is either no access (00) or reserved (10) then either a section domain fault or page domain fault occurs.

9.15.4 permission fault

there are two types of permission fault: section and sub-page. permission fault is checked at the same time as domain fault.
if the 2-bit domain field returns client (01), then the permission access check is invoked as follows:

section     if the level one descriptor defines a section-mapped access, then the ap bits of the descriptor define whether or not the access is allowed according to table 9-2: interpreting access permission (ap) bits on page 9-7. their interpretation is dependent upon the setting of the s bit (control register bit 8). if the access is not allowed, then a section permission fault is generated.

sub-page    if the level one descriptor defines a page-mapped access, then the level two descriptor specifies four access permission fields (ap3..ap0) each corresponding to one quarter of the page. hence for small pages, ap3 is selected by the top 1kb of the page, and ap0 is selected by the bottom 1kb of the page; for large pages, ap3 is selected by the top 16kb of the page, and ap0 is selected by the bottom 16kb of the page. the selected ap bits are then interpreted in exactly the same way as for a section (see table 9-2: interpreting access permission (ap) bits on page 9-7), the only difference being that the fault generated is a sub-page permission fault.

9.16 external aborts

in addition to the mmu-generated aborts, arm610 has an external abort pin which may be used to flag an error on an external memory access. however, some accesses aborted in this way are not restartable, so this pin must be used with great care. the following section describes the restrictions.

the following accesses may be aborted and restarted safely. if any of the following are aborted the external access will cease on the next cycle. in the case of a read-lock-write sequence in which the read aborts, the write will not happen.

    uncacheable reads
    unbuffered writes
    level one descriptor fetch
    level two descriptor fetch
    read-lock-write sequence

cacheable reads (linefetches)
a linefetch may be aborted safely provided the abort is flagged on word 0. in this case, the idc will not be updated or corrupted and the access will be restartable. it is not advisable to flag an abort on any word other than word 0 of a linefetch, as the idc will contain a corrupt line, and the instruction may not be restartable. on the external bus, an externally aborted linefetch will continue to the end as though it had not aborted.

buffered writes
buffered writes cannot be safely externally aborted. because the processor will have moved on before the external abort is received, this class of abort is not restartable. if the system does flag this type of abort, then the fault status register will record the fact, but this is a non-recoverable error, and the machine must be reset. therefore, the system should be configured such that it does not do buffered writes to areas of memory which are capable of flagging an external abort. if a buffered write burst is externally aborted, then the external write will continue to the end.
9.17 interaction of the mmu, idc and write buffer

the mmu, idc and wb may be enabled/disabled independently. however there are only five valid combinations. there are no hardware interlocks on these restrictions, so invalid combinations will cause undefined results. the following procedures must be observed.

to enable the mmu

1 program the translation table base and domain access control registers
2 program level 1 and level 2 page tables as required
3 enable the mmu by setting bit 0 in the control register

note    care must be taken if the translated address differs from the untranslated address, as the two instructions following the enabling of the mmu will have been fetched using flat translation and enabling the mmu may be considered as a branch with delayed execution. a similar situation occurs when the mmu is disabled. consider the following code sequence:

    mov r1, #0x1
    mcr 15,0,r1,0,0    ; enable mmu
    fetch flat
    fetch flat
    fetch translated

to disable the mmu

1 disable the wb by clearing bit 3 in the control register.
2 disable the idc by clearing bit 2 in the control register.
3 disable the mmu by clearing bit 0 in the control register.

mmu   idc   wb
off   off   off
on    off   off
on    on    off
on    off   on
on    on    on

figure 9-11: valid mmu, idc and write buffer combinations
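since there are no hardware interlocks, software may wish to check a control register value against figure 9-11 before writing it. the python sketch below is one way to express that check; the names are hypothetical, and only the bit positions quoted in the procedures above (bit 0 mmu, bit 2 idc, bit 3 wb) are taken from the text.

```python
# Illustrative sketch; names are hypothetical. Bit positions follow the
# enable/disable procedures: bit 0 = MMU, bit 2 = IDC, bit 3 = write buffer.
MMU_BIT = 1 << 0
IDC_BIT = 1 << 2
WB_BIT = 1 << 3

# The five (mmu, idc, wb) rows of figure 9-11.
VALID_COMBINATIONS = {
    (False, False, False),
    (True, False, False),
    (True, True, False),
    (True, False, True),
    (True, True, True),
}

def is_valid_combination(control: int) -> bool:
    """Check the MMU/IDC/WB enable bits of a control register value."""
    state = (bool(control & MMU_BIT),
             bool(control & IDC_BIT),
             bool(control & WB_BIT))
    return state in VALID_COMBINATIONS
```

the five rows amount to the rule that the idc and the write buffer may only be on while the mmu is on, which is also why the disable procedure clears bit 3 and bit 2 before bit 0.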
note    if the mmu is enabled, then disabled and subsequently re-enabled, the contents of the tlb will have been preserved. if these are now invalid, the tlb should be flushed before re-enabling the mmu.

all three functions may be disabled simultaneously.

9.18 effect of reset

see 3.6 reset on page 3-10.
10 bus interface

this chapter describes the arm610 bus interface.

10.1 introduction
10.2 arm610 cycle speed
10.3 cycle types
10.4 memory access
10.5 read/write
10.6 byte/word
10.7 maximum sequential length
10.8 memory access types
10.9 arm610 cycle type summary
10.1 introduction

the arm610 has two input clocks: fclk and mclk. the bus interface is always controlled by mclk. the core cpu switches between these two clocks according to the operation being carried out. for example, if the core cpu is reading data from the cache it will be clocked by fclk, whereas if it is reading data from uncached external memory it will be clocked by mclk. the arm610 control logic ensures that the correct clock is used internally and automatically switches between the two clocks.

the arm610 bus interface is designed to operate in synchronous mode. in this mode, there is a tightly defined relationship between fclk and mclk. mclk may only make transitions on the falling edge of fclk. an amount of jitter between the two clocks is permitted, and the device will function correctly, but mclk must not be later than fclk. refer to 13.2 relationship between fclk and mclk on page 13-2.

10.2 arm610 cycle speed

the bus interface is controlled by mclk, and all timing parameters are referenced with respect to this clock. the speed of the memory may be controlled in one of two ways.

1 the low and high phases of the clock may be stretched.
2 nwait can be used to insert entire mclk cycles into the access. when low, this signal maintains the low phase of the cycle by gating out mclk. nwait may only change when mclk is low.

10.3 cycle types

there are two basic cycle types performed by an arm610. these are idle cycles and memory cycles. idle cycles and memory cycles are combined to perform memory accesses. the two cycle types are differentiated by the signal nmreq. (seq is the inverse of nmreq, and is provided for backwards compatibility with earlier memory controllers.) nmreq high indicates an idle cycle, and nmreq low indicates a memory access. however, nmreq is pipelined, so that its value determines the type of the following cycle.
nmreq becomes valid during the low phase of the cycle before the one to which it refers. the address from arm610 becomes valid during the high phase of mclk. it is also pipelined, and its value refers to the following memory access.

10.4 memory access

there are two types of memory access. these are nonsequential and sequential. the nonsequential cycles occur when a new memory access takes place. a sequential cycle occurs when:

    the cycle is of the same type as the previous cycle
    the address is one word (four bytes) greater than the previous access

so for example, a single word access consists of a nonsequential access, and a two-word access consists of a nonsequential access followed by a sequential access.
nonsequential accesses consist of an idle cycle followed by a memory cycle, and sequential accesses consist simply of a memory cycle. in the case of a nonsequential access, the address is valid throughout the idle cycle, allowing extra time for memory decoding.

10.5 read/write

memory accesses may be read or write, differentiated by the signal nrw. this signal has the same timing as the address, so is likewise pipelined, and refers to the following cycle. in the case of a write, the arm610 outputs data on the data bus during the memory cycle. it becomes valid during mclk low, and is held until the end of the cycle. in the case of a read, data is sampled at the end of the memory cycle.

nrw may not change during a sequential access, so if a read from address a is followed immediately by a write to address (a+4), the write to address (a+4) will be a nonsequential access.

10.6 byte/word

likewise, any memory access may be of a word or a byte quantity. these are differentiated by the signal nbw, which also has the same timing as the address, ie. it becomes valid in the high phase of mclk in the cycle before the one to which it refers. nbw low indicates a byte access. again, nbw may not change during sequential accesses.

10.7 maximum sequential length

as explained above, the arm610 will perform sequential memory accesses whenever the cycle is of the same type (ie. byte/word, read/write) as the previous cycle, and the addresses are consecutive. however, sequential accesses are interrupted on a 256 word boundary. this is to allow the mmu to check the translation protection as the address crosses a sub-page boundary. if a sequential access is performed over a 256 word boundary, the access to word 256 is simply turned into a nonsequential access, and then further accesses continue sequentially as before.

figure 10-1: one word read or write
[timing diagram: mclk, a[31:0], nmreq, write d[31:0], read d[31:0]]
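the rule in 10.7 can be illustrated by labelling each word of a burst as nonsequential (n) or sequential (s). the python sketch below is not from the data sheet: the function name is hypothetical, it models a run of consecutive word accesses of one type only, and it assumes that a "256 word boundary" means a byte address that is a multiple of 1024.

```python
# Illustrative sketch; names are hypothetical. Assumes the 256 word
# boundary falls at byte addresses that are multiples of 256 * 4 = 1024.
WORD_BYTES = 4
BOUNDARY_BYTES = 256 * WORD_BYTES

def classify_burst(start_address: int, word_count: int):
    """Label each access of a consecutive same-type word burst as 'N' or 'S'."""
    accesses = []
    for index in range(word_count):
        address = start_address + index * WORD_BYTES
        # The first access is always nonsequential, as is any access
        # that lands exactly on a 256 word boundary.
        if index == 0 or address % BOUNDARY_BYTES == 0:
            kind = "N"
        else:
            kind = "S"
        accesses.append((address, kind))
    return accesses
```

a four-word burst starting at 0x3f8 would therefore be labelled n, s, n, s, with the second nonsequential access at 0x400.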
figure 10-2: two word sequential read or write
[timing diagram: mclk, a[31:0] (a1, a+4), nmreq, nrw, nbw, write d[31:0], read d[31:0]]

figure 10-3: two word nonsequential unbuffered accesses
[timing diagram: mclk, a[31:0] (a1, a2), nmreq, nrw, nbw, write d[31:0], read d[31:0]]

figure 10-4: two word nonsequential buffered writes
[timing diagram: mclk, a[31:0] (a1, a2), nmreq, nrw, nbw, write d[31:0], read d[31:0]]
10.8 memory access types

arm610 performs many different bus accesses, and all are constructed out of combinations of nonsequential and sequential accesses. there may be any number of idle cycles between two other memory accesses. if a memory access is followed by an idle period on the bus (as opposed to another nonsequential access), the address, and the signals nrw and nbw, will remain at their previous value in order to avoid unnecessary bus transitions.

the accesses performed by an arm610 are:

unbuffered write             see 10.8.1 unbuffered writes / uncacheable reads
buffered write               see 10.8.2 buffered write
linefetch                    see 10.8.3 linefetch
level 1 translation fetch    see 10.8.4 translation fetches
level 2 translation fetch    see 10.8.4 translation fetches
read-lock-write sequence     see 10.8.5 read - lock - write

10.8.1 unbuffered writes / uncacheable reads

these are the most basic access types. apart from the difference between read and write, they are the same. each may consist of a single (ldr/str) or multiple (ldm/stm) access. a multiple access consists of a nonsequential access followed by a sequential access. these cycles always reflect the type of the instruction requesting the cycle (ie. read/write, byte/word).

10.8.2 buffered write

the external bus cycle of a buffered write is identical to and indistinguishable from the bus cycle of an unbuffered write. these cycles always reflect the type (byte/word) of the instruction requesting the cycle. if several write accesses are stored concurrently within the write buffer, each access on the bus will start with a nonsequential access.

10.8.3 linefetch

this appears on the bus as a nonsequential access followed by three sequential accesses. linefetch accesses always start on a quad-word boundary, and are always word accesses. so if the instruction which caused the linefetch was a byte load instruction (ie. ldrb), the linefetch access will be a word access on the bus.
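as a small worked example of 10.8.3, the python sketch below lists the four word addresses that a linefetch would place on the bus for a given requested address. the function name is hypothetical; the sketch assumes only what the text states, namely that the fetch starts on a quad-word (16 byte) boundary and consists of four word accesses.

```python
# Illustrative sketch; the function name is hypothetical.
LINE_BYTES = 16   # quad-word line: four 32-bit words
WORD_BYTES = 4

def linefetch_addresses(address: int):
    """Return the four word addresses fetched for the line containing address."""
    base = address & ~(LINE_BYTES - 1)   # round down to a quad-word boundary
    return [base + offset for offset in range(0, LINE_BYTES, WORD_BYTES)]
```

a byte load from 0x1006 would thus appear as word reads of 0x1000, 0x1004, 0x1008 and 0x100c, the first nonsequential and the remaining three sequential.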
figure 10-5: linefetch
[timing diagram: mclk, a[31:0] (a, a+4, a+8, a+12), nmreq, read d[31:0]]

10.8.4 translation fetches

these accesses are required to obtain the translation data for an access. there are two types:

level 1    a level 1 access is required for a section-mapped memory location.
level 2    a level 2 access is required for a page-mapped memory location. a level 2 access is always preceded by a level 1 access.

translation fetches are often immediately followed by a data access. in fact the translation fetch held up the data access because the translation was not contained in the translation lookaside buffer (tlb). translation fetches are always read word accesses. so if a byte or write (or both) access is not possible because the address is not contained in the tlb, the access would be preceded by the translation fetch(es), which will always be word read accesses.

figure 10-6: translation table-walking sequence (write) for page
[timing diagram: mclk, a[31:0] (level 1 address, level 2 address, physical address), nmreq, nrw, write d[31:0] (write data), read d[31:0] (page table descriptor, page table entry)]
figure 10-7: translation table-walking sequence (write) for section
[timing diagram: mclk, a[31:0] (level 1 address, physical address), nmreq, nrw, write d[31:0] (write data), read d[31:0] (section descriptor)]

10.8.5 read - lock - write

the read-lock-write sequence is generated by a swp instruction. on the bus it consists of a read access followed by a write access to the same address, and both are treated as nonsequential accesses. the cycle is differentiated by the lock signal. lock has the timing of address, ie. it goes high in the high phase of mclk at the start of the read access. however, it always goes low at the end of the write access, even if the following cycle is an idle cycle (unless of course the following access was a read-lock-write sequence).

figure 10-8: read - locked - write
[timing diagram: mclk, a[31:0] (address), nmreq, lock, nrw, d[31:0] (read, write)]
figure 10-9: use of nwait pin to stop arm610 for 1 mclk cycle
[timing diagram: mclk, a[31:0] (a, a+4, a+8), nmreq, nwait, read d[31:0]]
10.9 arm610 cycle type summary

operation                                      nrw   a[31:0]   nmreq   d[31:0]
idle                                           old   old       i
linefetch                                      r     a         i
                                               r     a         m
                                               r     a+4       m       d
                                               r     a+8       m       d
                                               r     a+12      m       d
                                               r     a+16      m       d
                                               r     a+20      m       d
                                               r     a+24      m       d
                                               r     a+28      m       d
                                               r     a+12      i       d
uncacheable read / unbuffered write: start     r/w   a         i
                                               r/w   a         m       d
uncacheable read / unbuffered write: repeat    r/w   a+n       m       d
uncacheable read / unbuffered write: end       r/w   old       i
buffered write: start                          w     a         i
                                               w     a         m       d
buffered write: more                           w     a+n       m       d
read-lock-write: read phase                    r     al        i
                                               r     al        m
                                               r     al        i       d
read-lock-write: write phase - unbuffered      w     al        i
                                               w     al        m
                                               w     al        i       d
read-lock-write: write phase - buffered        w     al        i
                                               w     al        m       d
read-lock-write: write phase - aborted         w     al        i
                                               w     al        i

table 10-1: cycle type summary
key to cycle type summary

r      read (nrw low)
r/w    applies equally to read and write
w      write (nrw high)
old    signal remains at previous value
a      first address
a+n    next sequential address
al     read-lock-write address
i      idle cycle (nmreq high)
m      memory cycle (nmreq low)
d      valid data on data bus

each line in table 10-1: cycle type summary shows the state of the bus interface during a single mclk cycle. it illustrates the pipelining of nmreq and the address.

each operation type section shows the sequence of cycles which make up that type of access, with each line down the diagram showing successive clock cycles.

the uncached read / unbuffered write is shown in three sections. the start and end are always present, with the repeat section repeated as many times as required when a multiple access is being performed.

buffered writes are also of variable length and consist of the start section plus as many consecutive repeat sections as are necessary.

a swap instruction consists of the read phase, followed by one of the three possible write phases.

activity on the memory interface is the succession of these access sequences.
11 boundary-scan test interface

this chapter describes the arm610 boundary-scan test interface.

11.1 introduction
11.2 overview
11.3 reset
11.4 pullup resistors
11.5 instruction register
11.6 public instructions
11.7 test data registers
11.8 boundary-scan interface signals
11.1 introduction

the boundary-scan interface conforms to ieee std. 1149.1-1990, standard test access port and boundary-scan architecture. please refer to this for an explanation of the terms used in this section and for a description of the tap controller states.

11.2 overview

the boundary-scan interface provides a means of testing the core of the device when it is fitted to a circuit board, and a means of driving and sampling all the external pins of the device irrespective of the core state. this latter function permits testing of both the device's electrical connections to the circuit board, and (in conjunction with other devices on the circuit board having a similar interface) testing the integrity of the circuit board connections between devices. the interface intercepts all external connections within the device, and each such cell is then connected together to form a serial register (the boundary-scan register). the whole interface is controlled via five dedicated pins: tdi, tms, tck, ntrst and tdo. figure 11-1: test access port (tap) controller state transitions shows the state transitions that occur in the tap controller.

figure 11-1: test access port (tap) controller state transitions
[state diagram: the sixteen tap controller states (test-logic reset, run-test/idle, select-dr-scan, capture-dr, shift-dr, exit1-dr, pause-dr, exit2-dr, update-dr, select-ir-scan, capture-ir, shift-ir, exit1-ir, pause-ir, exit2-ir, update-ir), with each transition selected by tms=0 or tms=1]
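the state transitions of figure 11-1 are those of the standard ieee 1149.1 tap controller, and can be written down as a lookup table. the python sketch below is an illustration only: the table is the standard 1149.1 state machine rather than something transcribed from this data sheet's figure, and the names `TAP_NEXT` and `walk` are hypothetical.

```python
# Illustrative sketch of the standard IEEE 1149.1 TAP state machine.
# Each entry maps a state to (next state if TMS=0, next state if TMS=1),
# sampled on a rising edge of TCK. Names are hypothetical.
TAP_NEXT = {
    "test-logic-reset": ("run-test/idle", "test-logic-reset"),
    "run-test/idle":    ("run-test/idle", "select-dr-scan"),
    "select-dr-scan":   ("capture-dr", "select-ir-scan"),
    "capture-dr":       ("shift-dr", "exit1-dr"),
    "shift-dr":         ("shift-dr", "exit1-dr"),
    "exit1-dr":         ("pause-dr", "update-dr"),
    "pause-dr":         ("pause-dr", "exit2-dr"),
    "exit2-dr":         ("shift-dr", "update-dr"),
    "update-dr":        ("run-test/idle", "select-dr-scan"),
    "select-ir-scan":   ("capture-ir", "test-logic-reset"),
    "capture-ir":       ("shift-ir", "exit1-ir"),
    "shift-ir":         ("shift-ir", "exit1-ir"),
    "exit1-ir":         ("pause-ir", "update-ir"),
    "pause-ir":         ("pause-ir", "exit2-ir"),
    "exit2-ir":         ("shift-ir", "update-ir"),
    "update-ir":        ("run-test/idle", "select-dr-scan"),
}

def walk(state: str, tms_bits) -> str:
    """Apply a sequence of TMS values, one per TCK rising edge."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state
```

one property of this state machine is that holding tms high for five tck cycles reaches test-logic reset from any state; `walk("test-logic-reset", [0, 1, 1, 0, 0])` follows the path into shift-ir.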
11.3 reset

the boundary-scan interface includes a state-machine controller (the tap controller). in order to force the tap controller into the correct state after power-up of the device, a reset pulse must be applied to the ntrst pin. if the boundary-scan interface is to be used, ntrst must be driven low, and then high again. if the boundary-scan interface is not to be used, the ntrst pin may be tied permanently low. a clock on tck is not needed to reset the device.

the action of reset (either a pulse or a dc level) is as follows:

    system mode is selected (ie. the boundary-scan chain does not intercept any of the signals passing between the pads and the core).
    idcode mode is selected. if tck is pulsed, the contents of the id register will be clocked out of tdo.

11.4 pullup resistors

the ieee 1149.1 standard effectively requires that tdi, tms, and ntrst should have internal pullup resistors. in order to allow arm610 to consume zero static current, these resistors are not fitted to this device. accordingly, the four inputs to the test interface (the above three signals plus tck) must all be driven to good logic levels to achieve normal circuit operation.

11.5 instruction register

the instruction register is four bits in length. there is no parity bit. the fixed value loaded into the instruction register during the capture-ir controller state is 0001.

11.6 public instructions

the following public instructions are supported:

instruction       binary code
extest            0000
sample/preload    0011
clamp             0101
highz             0111
clampz            1001
intest            1100
idcode            1110
bypass            1111

table 11-1: public instructions
in the descriptions that follow, tdi and tms are sampled on the rising edge of tck and all output transitions on tdo occur as a result of the falling edge of tck.

11.6.1 extest (0000)

the bs (boundary-scan) register is placed in test mode by the extest instruction. the extest instruction connects the bs register between tdi and tdo.

when the instruction register is loaded with the extest instruction, all the boundary-scan cells are placed in their test mode of operation.

in the capture-dr state, inputs from the system pins and outputs from the boundary-scan output cells to the system pins are captured by the boundary-scan cells. in the shift-dr state, the previously captured test data is shifted out of the bs register via the tdo pin, while new test data is shifted in via the tdi pin to the bs register parallel input latch. in the update-dr state, the new test data is transferred into the bs register parallel output latch. this data is applied immediately to the system logic and system pins. the first extest vector should be clocked into the boundary-scan register, using the sample/preload instruction, prior to selecting extest to ensure that known data is applied to the system logic.

11.6.2 sample/preload (0011)

the bs (boundary-scan) register is placed in normal (system) mode by the sample/preload instruction. the sample/preload instruction connects the bs register between tdi and tdo.

when the instruction register is loaded with the sample/preload instruction, all the boundary-scan cells are placed in their normal system mode of operation.

in the capture-dr state, a snapshot of the signals at the boundary-scan cells is taken on the rising edge of tck. normal system operation is unaffected. in the shift-dr state, the sampled test data is shifted out of the bs register via the tdo pin, while new data is shifted in via the tdi pin to preload the bs register parallel input latch.
in the update-dr state, the preloaded data is transferred into the bs register parallel output latch. this data is not applied to the system logic or system pins while the sample/preload instruction is active. this instruction should be used to preload the boundary-scan register with known data prior to selecting the intest or extest instructions (see the table below for appropriate guard values to be used for each boundary-scan cell).

11.6.3 clamp (0101)

the clamp instruction connects a 1-bit shift register (the bypass register) between tdi and tdo.

when the clamp instruction is loaded into the instruction register, the state of all output signals is defined by the values previously loaded into the boundary-scan register. a guarding pattern (specified for this device at the end of this section) should be pre-loaded into the boundary-scan register using the sample/preload instruction prior to selecting the clamp instruction.
in the capture-dr state, a logic 0 is captured by the bypass register. in the shift-dr state, test data is shifted into the bypass register via tdi and out via tdo after a delay of one tck cycle. the first bit shifted out will be a zero. the bypass register is not affected in the update-dr state.

11.6.4 highz (0111)

the highz instruction connects a 1-bit shift register (the bypass register) between tdi and tdo.

when the highz instruction is loaded into the instruction register, all outputs are placed in an inactive drive state.

in the capture-dr state, a logic 0 is captured by the bypass register. in the shift-dr state, test data is shifted into the bypass register via tdi and out via tdo after a delay of one tck cycle. the first bit shifted out will be a zero. the bypass register is not affected in the update-dr state.

11.6.5 clampz (1001)

the clampz instruction connects a 1-bit shift register (the bypass register) between tdi and tdo.

when the clampz instruction is loaded into the instruction register, all outputs are placed in an inactive drive state, but the data supplied to the disabled output drivers is derived from the boundary-scan cells. the purpose of this instruction is to ensure, during production testing, that each output driver can be disabled when its data input is either a 0 or a 1.

a guarding pattern (specified for this device at the end of this section) should be pre-loaded into the boundary-scan register using the sample/preload instruction prior to selecting the clampz instruction.

in the capture-dr state, a logic 0 is captured by the bypass register. in the shift-dr state, test data is shifted into the bypass register via tdi and out via tdo after a delay of one tck cycle. the first bit shifted out will be a zero. the bypass register is not affected in the update-dr state.

11.6.6 intest (1100)

the bs (boundary-scan) register is placed in test mode by the intest instruction.
the intest instruction connects the bs register between tdi and tdo.

when the instruction register is loaded with the intest instruction, all the boundary-scan cells are placed in their test mode of operation.

in the capture-dr state, the complement of the data supplied to the core logic from input boundary-scan cells is captured, while the true value of the data that is output from the core logic to output boundary-scan cells is captured. capture-dr captures the complemented value of the input cells for testability reasons.
in the shift-dr state, the previously captured test data is shifted out of the bs register via the tdo pin, while new test data is shifted in via the tdi pin to the bs register parallel input latch. in the update-dr state, the new test data is transferred into the bs register parallel output latch. this data is applied immediately to the system logic and system pins. the first intest vector should be clocked into the boundary-scan register, using the sample/preload instruction, prior to selecting intest to ensure that known data is applied to the system logic.

single-step operation is possible using the intest instruction.

11.6.7 idcode (1110)

the idcode instruction connects the device identification register (or id register) between tdi and tdo. the id register is a 32-bit register that allows the manufacturer, part number and version of a component to be determined through the tap.

when the instruction register is loaded with the idcode instruction, all the boundary-scan cells are placed in their normal (system) mode of operation.

in the capture-dr state, the device identification code (specified at the end of this section) is captured by the id register. in the shift-dr state, the previously captured device identification code is shifted out of the id register via the tdo pin, while data is shifted in via the tdi pin into the id register. in the update-dr state, the id register is unaffected.

11.6.8 bypass (1111)

the bypass instruction connects a 1-bit shift register (the bypass register) between tdi and tdo.

when the bypass instruction is loaded into the instruction register, all the boundary-scan cells are placed in their normal (system) mode of operation. this instruction has no effect on the system pins.

in the capture-dr state, a logic 0 is captured by the bypass register. in the shift-dr state, test data is shifted into the bypass register via tdi and out via tdo after a delay of one tck cycle.
the first bit shifted out will be a zero. the bypass register is not affected in the update-dr state.
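the behaviour of the 1-bit bypass register during a data scan can be modelled in a few lines of python. this is a behavioural sketch with a hypothetical function name, not a description of the implementation: it assumes one list element per tck cycle spent in shift-dr, and shows why the first bit out is a zero and every later bit is the tdi value from one cycle earlier.

```python
# Illustrative behavioural sketch; the function name is hypothetical.
def shift_through_bypass(tdi_bits):
    """Return the TDO bit seen on each shift-dr cycle for the given TDI bits."""
    register = 0            # capture-dr loads a logic 0
    tdo_bits = []
    for tdi in tdi_bits:
        tdo_bits.append(register)   # the current contents appear on TDO
        register = tdi              # the TDI value is shifted in
    return tdo_bits
```

shifting in 1, 0, 1, 1 therefore produces 0, 1, 0, 1 on tdo.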
11.7 test data registers

figure 11-2: boundary-scan block diagram illustrates the structure of the boundary-scan logic.

figure 11-2: boundary-scan block diagram
[block diagram: arm core logic surrounded by boundary-scan cells (bsincell, bsoutcell, bsinencell, bsoutnencell) at the i/o cells, together with the instruction register, instruction decoder, device id register, bypass register and tap controller; signals tdi, tms, tck, ntrst, tdo and ntdoen]
11.7.1 bypass register

purpose: this is a single bit register which can be selected as the path between tdi and tdo to allow the device to be bypassed during boundary-scan testing.

length: 1 bit

operating mode: when the bypass instruction is the current instruction in the instruction register, serial data is transferred from tdi to tdo in the shift-dr state with a delay of one tck cycle. there is no parallel output from the bypass register. a logic 0 is loaded from the parallel input of the bypass register in the capture-dr state.

11.7.2 arm610 device identification (id) code register

purpose: this register is used to read the 32-bit device identification code. no programmable supplementary identification code is provided.

length: 32 bits

the format of the id register is as follows:

bits 31-28    version
bits 27-12    part number
bits 11-1     manufacturer identity
bit 0         1

the device identification code for the zarlink p610arm-b/kg/fpnr is 1ea1d06f.

operating mode: when the idcode instruction is current, the id register is selected as the serial path between tdi and tdo. there is no parallel output from the id register. the 32-bit device identification code is loaded into the id register from its parallel inputs during the capture-dr state.

11.7.3 arm610 boundary-scan (bs) register

purpose: the bs register consists of a serially connected set of cells around the periphery of the device, at the interface between the core logic and the system input/output pads. this register can be used to isolate the core logic from the pins and then apply tests to the core logic, or conversely to isolate the pins from the core logic and then drive or monitor the system pins.

operating modes: the bs register is selected as the register to be connected between tdi and tdo only during the sample/preload, extest and intest instructions. values in the bs register are used, but are not changed, during the clamp and clampz instructions.

in the normal (system) mode of operation, straight-through connections between the core logic and pins are maintained and normal system operation is unaffected.
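the id register format given in 11.7.2 (version, part number, manufacturer identity and a fixed 1 in bit 0) can be unpacked with simple shifts and masks. the python sketch below uses a hypothetical function name and treats the code quoted in 11.7.2 as a hexadecimal value; it only separates the fields and does not interpret them.

```python
# Illustrative sketch; the function name is hypothetical. Field positions
# follow the ID register format: version 31-28, part number 27-12,
# manufacturer identity 11-1, and a fixed 1 in bit 0.
def decode_idcode(idcode: int) -> dict:
    """Split a 32-bit device identification code into its fields."""
    return {
        "version": (idcode >> 28) & 0xF,
        "part_number": (idcode >> 12) & 0xFFFF,
        "manufacturer": (idcode >> 1) & 0x7FF,
        "bit0": idcode & 0x1,
    }

fields = decode_idcode(0x1EA1D06F)
```

for the quoted code this gives version 0x1, part number 0xea1d and manufacturer identity 0x037, with bit 0 set.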
in test mode (i.e. when either extest or intest is the currently selected instruction), values can be applied to the core logic or output pins independently of the actual values on the input pins and core logic outputs respectively. on the arm610, all of the boundary-scan cells include an update register and thus all of the pins can be controlled in the above manner. additional boundary-scan cells are interposed in the scan chain in order to control the enabling of tristateable buses.

the correspondence between boundary-scan cells and system pins, system direction controls and system output enables is as shown in table 11-3: boundary-scan signals and pins. the cells are listed in the order in which they are connected in the boundary-scan register, starting with the cell closest to tdi. all boundary-scan register cells at input pins can apply tests to the on-chip core logic.

the extest guard values specified in table 11-3: boundary-scan signals and pins should be clocked into the boundary-scan register (using the sample/preload instruction) before the extest instruction is selected, to ensure that known data is applied to the core logic during the test. the intest guard values shown in the table below should be clocked into the boundary-scan register (using the sample/preload instruction) before the intest instruction is selected to ensure that all outputs are disabled. these guard values should also be used when new extest or intest vectors are clocked into the boundary-scan register.

the values stored in the bs register after power-up are not defined. similarly, the values previously clocked into the bs register are not guaranteed to be maintained across a boundary-scan reset (from forcing ntrst low or entering the test logic reset state).
11.7.4 output enable boundary-scan cells

the boundary-scan register cells nendout, nabe, ntbe, and nmse control the output drivers of tristate outputs as shown in the table below. in the case of type outen0 enable cells (nendout, ntbe), loading a 1 into the cell will place the associated drivers into the tristate state, while in the case of type inen1 enable cells (nabe, nmse), loading a 0 into the cell will tristate the associated drivers.

to put all arm610 tristate outputs into their high impedance state, a logic 1 should be clocked into the output enable boundary-scan cells nendout and ntbe, and a logic 0 should be clocked into nabe and nmse. alternatively, the highz instruction can be used.

if the on-chip core logic causes the drivers controlled by nendout, for example, to be tristate (i.e. by driving the signal nendout high), then a 1 will be observed on this cell if the sample/preload or intest instructions are active.

11.7.5 single-step operation

arm610 is a static design and there is no minimum clock speed. it can therefore be single-stepped while the intest instruction is selected. this can be achieved by serialising a parallel stimulus and clocking the resulting serial vectors into the boundary-scan register. when the boundary-scan register is updated, new test stimuli are applied to the core logic inputs; the effect of these stimuli can then be observed on the core logic outputs by capturing them in the boundary-scan register.
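the serialising step described in 11.7.5 can be sketched as follows. this is a minimal model and not arm test software: the three-cell chain, the stimulus values and the function names are hypothetical. it relies only on the shift-register behaviour of a scan chain, in which the first bit presented on tdi ends up in the cell furthest from tdi.

```python
# minimal model of loading a parallel stimulus into a scan chain.
# cell names and values are hypothetical examples.

def serialise(cells_from_tdi, stimulus):
    """return the bits in the order they must be presented on tdi.

    cells_from_tdi lists the cells starting with the one closest to tdi,
    so the value for the last cell in the list has to be shifted in first.
    """
    return [stimulus[name] for name in reversed(cells_from_tdi)]


def shift_in(chain_length, bits):
    """model the shift-dr state: each clock moves every bit one cell away from tdi."""
    chain = [0] * chain_length        # index 0 is the cell closest to tdi
    for bit in bits:
        chain = [bit] + chain[:-1]    # the new bit enters next to tdi
    return chain


cells = ["cell_a", "cell_b", "cell_c"]               # cell_a is closest to tdi
stimulus = {"cell_a": 1, "cell_b": 1, "cell_c": 0}   # parallel test vector

serial_vector = serialise(cells, stimulus)
loaded = shift_in(len(cells), serial_vector)
print(serial_vector, loaded)
```

after the last shift each modelled cell holds the value intended for it, which is the condition needed before the register is updated and the stimulus is applied to the core logic.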
11.8 boundary-scan interface signals

[figure 11-3: boundary-scan timing. signals shown: tck, tms, tdi, tdo, data i/o, data out, ntrst. timings shown: tbscl, tbsch, tbsis, tbsih, tbsoh, tbsod, tbsoe, tbsoz, tbsss, tbssh, tbsdh, tbsdd, tbsde, tbsdz, tbsr, tbsrs, tbsrh.]
symbol  parameter                      min/typ/max  units  notes
tbscl   tck low period                 50           ns     1
tbsch   tck high period                50           ns     1
tbsis   tdi,tms setup to [tcr]         10           ns
tbsih   tdi,tms hold from [tcr]        10           ns
tbsod   tcf to tdo valid               30           ns     2
tbsoh   tdo hold time                  5            ns     2
tbsoe   tdo enable time                5            ns     2,3
tbsoz   tdo disable time               12.5         ns     2,4
tbsss   i/o signal setup to [tcr]      5            ns     5
tbssh   i/o signal hold from [tcr]     20           ns     5
tbsdd   tcf to data output valid       30           ns
tbsdh   data output hold time          5            ns     6
tbsde   data output enable time        5            ns     6,7
tbsdz   data output disable time       16.5         ns     6,8
tbsr    reset period                   30           ns
tbsrs   tms setup to [trr]             10           ns     9
tbsrh   tms hold from [trr]            10           ns     9

table 11-2: arm610 boundary-scan interface timing

notes

1 tck may be stopped indefinitely in either the low or high phase.
2 assumes a 25pf load on tdo. output timing derates at 0.072ns/pf of extra load applied.
3 tdo enable time applies when the tap controller enters the shift-dr or shift-ir states.
4 tdo disable time applies when the tap controller leaves the shift-dr or shift-ir states.
5 for correct data latching, the i/o signals (from the core and the pads) must be setup and held with respect to the rising edge of tck in the capture-dr state of the sample/preload, intest and extest instructions.
6 assumes that the data outputs are loaded with the ac test loads (see ac parameter specification).
7 data output enable time applies when the boundary-scan logic is used to enable the output drivers.
8 data output disable time applies when the boundary-scan is used to disable the output drivers.
9 tms must be held high as ntrst is taken high at the end of the boundary-scan reset sequence.

key

in      input pad
out     output pad
inen1   input enable active high
outen0  output enable active low
*       guard value columns: in for intest, ex for extest/clamp

no.  cell name   pin         type    output enable bs cell  guard value * (in / ex)
1    din23       d[23]       in      -
2    dout23      d[23]       out     nendout
3    din22       d[22]       in      -
4    dout22      d[22]       out     nendout
5    din21       d[21]       in      -
6    dout21      d[21]       out     nendout
7    din20       d[20]       in      -
8    dout20      d[20]       out     nendout
9    din19       d[19]       in      -
10   dout19      d[19]       out     nendout
11   din18       d[18]       in      -
12   dout18      d[18]       out     nendout
13   din17       d[17]       in      -
14   dout17      d[17]       out     nendout
15   din16       d[16]       in      -
16   dout16      d[16]       out     nendout
17   din15       d[15]       in      -
18   dout15      d[15]       out     nendout
19   din14       d[14]       in      -
20   dout14      d[14]       out     nendout
21   din13       d[13]       in      -
22   dout13      d[13]       out     nendout
23   din12       d[12]       in      -
24   dout12      d[12]       out     nendout
25   din11       d[11]       in      -
26   dout11      d[11]       out     nendout
27   din10       d[10]       in      -
28   dout10      d[10]       out     nendout
29   din9        d[9]        in      -
30   dout9       d[9]        out     nendout
31   nendout     -           outen0  -                      1
32   din8        d[8]        in      -
33   dout8       d[8]        out     nendout
34   din7        d[7]        in      -
35   dout7       d[7]        out     nendout
36   din6        d[6]        in      -
37   dout6       d[6]        out     nendout
38   din5        d[5]        in      -
39   dout5       d[5]        out     nendout
40   din4        d[4]        in      -
41   dout4       d[4]        out     nendout
42   din3        d[3]        in      -
43   dout3       d[3]        out     nendout
44   din2        d[2]        in      -
45   dout2       d[2]        out     nendout
46   din1        d[1]        in      -
47   dout1       d[1]        out     nendout
48   din0        d[0]        in      -
49   dout0       d[0]        out     nendout
50   dbe         dbe         in      -
51   seq         seq         out     nmse
52   nmreq       nmreq       out     nmse
53   nmse        mse         inen1   -                      0
54   sna         sna         in      -
55   nwait       nwait       in      -
56   mclk        mclk        in      -                      0
57   fclk        fclk        in      -                      0
58   abort       abort       in      -
59   nreset      nreset      in      -
60   testin[16]  testin[16]  in      -                      0
61   testout[2]  testout[2]  out     ntbe
62   testout[1]  testout[1]  out     ntbe
63   testout[0]  testout[0]  out     ntbe
64   nirq        nirq        in      -
65   nfiq        nfiq        in      -
66   testin[0]   testin[0]   in      -                      0
67   testin[1]   testin[1]   in      -                      0
68   testin[2]   testin[2]   in      -                      0
69   testin[3]   testin[3]   in      -                      0
70   testin[4]   testin[4]   in      -                      0
71   testin[5]   testin[5]   in      -                      0
72   testin[6]   testin[6]   in      -                      0
73   testin[7]   testin[7]   in      -                      0
74   ntbe        -           outen0  -                      1
75   ale         ale         in      -
76   a31         a[31]       out     nabe
77   a30         a[30]       out     nabe
78   a29         a[29]       out     nabe
79   a28         a[28]       out     nabe
80   a27         a[27]       out     nabe
81   a26         a[26]       out     nabe
82   a25         a[25]       out     nabe
83   a24         a[24]       out     nabe
84   a23         a[23]       out     nabe
85   a22         a[22]       out     nabe
86   a21         a[21]       out     nabe
87   a20         a[20]       out     nabe
88   a19         a[19]       out     nabe
89   a18         a[18]       out     nabe
90   a17         a[17]       out     nabe
91   a16         a[16]       out     nabe
92   a15         a[15]       out     nabe
93   a14         a[14]       out     nabe
94   a13         a[13]       out     nabe
95   a12         a[12]       out     nabe
96   a11         a[11]       out     nabe
97   a10         a[10]       out     nabe
98   a09         a[09]       out     nabe
99   a08         a[08]       out     nabe
100  a07         a[07]       out     nabe
101  a06         a[06]       out     nabe
102  a05         a[05]       out     nabe
103  a04         a[04]       out     nabe
104  a03         a[03]       out     nabe
105  a02         a[02]       out     nabe
106  a01         a[01]       out     nabe
107  a00         a[00]       out     nabe
108  nabe        abe         inen1   -                      0
109  rlw         lock        out     nabe
110  nbw         nbw         out     nabe
111  nrw         nrw         out     nabe
112  testin[15]  testin[15]  in      -                      0
113  testin[14]  testin[14]  in      -                      0
114  testin[13]  testin[13]  in      -                      0
115  testin[12]  testin[12]  in      -                      0
116  testin[11]  testin[11]  in      -                      0
117  testin[10]  testin[10]  in      -                      0
118  testin[09]  testin[09]  in      -                      0
119  testin[08]  testin[08]  in      -                      0
120  din31       d[31]       in      -
121  dout31      d[31]       out     nendout
122  din30       d[30]       in      -
123  dout30      d[30]       out     nendout
124  din29       d[29]       in      -
125  dout29      d[29]       out     nendout
126  din28       d[28]       in      -
127  dout28      d[28]       out     nendout
128  din27       d[27]       in      -
129  dout27      d[27]       out     nendout
130  din26       d[26]       in      -
131  dout26      d[26]       out     nendout
132  din25       d[25]       in      -
133  dout25      d[25]       out     nendout
134  din24       d[24]       in      -
135  dout24      d[24]       out     nendout

table 11-3: boundary-scan signals and pins
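the polarity rule for the four output enable cells (11.7.4, and the outen0 and inen1 entries in table 11-3) can be written as a small lookup. this is an illustrative sketch only; the names TRISTATE_VALUE, is_tristated and all_outputs_high_impedance are hypothetical and the code does not drive any hardware.

```python
# value that tristates the drivers controlled by each output enable cell:
# type outen0 cells (nendout, ntbe) tristate on 1,
# type inen1 cells (nabe, nmse) tristate on 0.
TRISTATE_VALUE = {"nendout": 1, "ntbe": 1, "nabe": 0, "nmse": 0}


def is_tristated(cell, value):
    """true if loading value into the named enable cell tristates its drivers."""
    return value == TRISTATE_VALUE[cell]


def all_outputs_high_impedance(enable_cells):
    """true if every output enable cell holds its tristating value."""
    return all(is_tristated(cell, enable_cells[cell]) for cell in TRISTATE_VALUE)


high_z = {"nendout": 1, "ntbe": 1, "nabe": 0, "nmse": 0}
driving = {"nendout": 0, "ntbe": 0, "nabe": 1, "nmse": 1}
print(all_outputs_high_impedance(high_z), all_outputs_high_impedance(driving))
```

the high_z values are the ones 11.7.4 gives for placing all arm610 tristate outputs into their high impedance state.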
12 dc parameters

this chapter describes the arm610 dc parameters.

12.1 absolute maximum ratings
12.2 dc operating conditions
12.3 dc characteristics
12.1 absolute maximum ratings

symbol  parameter                    min      max      units   notes
vdd     supply voltage               vss-0.3  vss+7.0  v       1
vip     voltage applied to any pin   vss-0.3  vdd+0.3  v       1
ts      storage temperature          -40      125      deg. c  1

table 12-1: arm610 dc maximum ratings

note: these are stress ratings only. exceeding the absolute maximum ratings may permanently damage the device. operating the device at absolute maximum ratings for extended periods may affect device reliability.

12.2 dc operating conditions

symbol  parameter                      min   max   units   notes
vdd     supply voltage                 4.5   5.5   v
viht    it input high voltage          2.4   vdd   v
vilt    it input low voltage           0.0   0.8   v
vohc    ocz output high voltage        3.5   vdd   v
volc    ocz output low voltage         0.0   0.4   v
ta      ambient operating temperature  -10   70    deg. c

table 12-2: arm610 dc operating conditions

notes

1 voltages measured with respect to vss.
2 it - ttl-level inputs (includes it and itotz pin types)
3 ocz - output, cmos levels, tri-stateable
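the limits in table 12-2 can be applied mechanically when checking a board design. the sketch below is one possible way to do so; the function names and the "undefined" label for voltages between vilt max and viht min are assumptions made for illustration and are not part of the data sheet.

```python
# limits taken from table 12-2 (volts, measured with respect to vss)
VDD_MIN, VDD_MAX = 4.5, 5.5
VILT_MAX = 0.8    # it input low voltage, maximum
VIHT_MIN = 2.4    # it input high voltage, minimum


def supply_in_range(vdd):
    """true if the supply voltage is within the operating conditions."""
    return VDD_MIN <= vdd <= VDD_MAX


def ttl_input_level(v_in, vdd):
    """classify a voltage on an it (ttl-level) input pin."""
    if 0.0 <= v_in <= VILT_MAX:
        return "low"
    if VIHT_MIN <= v_in <= vdd:
        return "high"
    return "undefined"   # assumption: label for anything outside both ranges


print(supply_in_range(5.0), ttl_input_level(0.4, 5.0), ttl_input_level(3.0, 5.0))
```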
12.3 dc characteristics

symbol  parameter                     min/max  units
idd     static supply current         30       µa
isc     output short circuit current  100      ma
ilu     dc latch-up current           100      ma
iin     it input leakage current      +/- 10   µa
cin     input capacitance             5 (typ)  pf
esd     hbm model esd                 2        kv

table 12-3: arm610 dc characteristics
13 ac parameters

this chapter describes the arm610 ac parameters.

13.1 test conditions
13.2 relationship between fclk and mclk
13.3 main bus signals
13.1 test conditions

the ac timing diagrams presented in this section assume that the outputs of arm610 have been loaded with the capacitive loads shown in the test load column of the table below; these loads have been chosen as typical of the system in which arm610 might be employed.

the output pads of arm610 are cmos drivers which exhibit a propagation delay that increases linearly with the increase in load capacitance. an output derating figure is given for each output pad, showing the approximate rate of increase of output time with increasing load capacitance.

output signal  test load (pf)  output derating (ns/pf)
a[25:0]        50              0.072
d[31:0]        50              0.072
nr/w           50              0.072
nb/w           50              0.072
lock           50              0.072
nmreq          50              0.072
seq            50              0.072

table 13-1: arm610 ac test conditions

13.2 relationship between fclk and mclk

[figure 13-1: clock timing relationship. signals shown: fclk, mclk. timings shown: tfckl, tfckh, tfmh, twh.]
symbol  parameter              min/max  unit  note
tfckl   fclk low time          15       ns    1
tfckh   fclk high time         15       ns    1
tfmh    fclk - mclk hold time  18       ns
tmfs    mclk - fclk setup      3        ns

table 13-2: arm610 fclk and mclk relationship

note

1 fclk timings measured at 50% of vdd.

13.2.1 disable times

disable times in this data sheet are specified in the following manner:

[figure 13-2: disable times specification. labels shown: driver input, disable time, 13ns, 2.6v, output goes hi z.]

13.2.2 tald measurement

tald is the maximum delay allowed in the ale input transition to guarantee the address will not change:

[figure 13-3: tald measurement. signals shown: mclk, ale, a[31:0]. timing shown: tald.]

13.3 main bus signals
[figure 13-4: arm610 main bus timing. signals shown: mclk, nwait, abe, a[31:0], dbe, d[31:0] out, d[31:0] in, abort, mse, nmreq, seq, nbw, lock, nrw, ale. timings shown: tmckl, tmckh, tws, twh, taddr, tabe, tah, tabz, tdbe, tdbz, tdoh, tdih, tde, tdout, tdz, tdis, tabts, tabth1, tabth2, tmsh, tmsd, tmsz, tmse, tale.]
symbol  parameter              min/max  unit  note
tmckl   mclk low time          26       ns    1
tmckh   mclk high time         26       ns
tws     nwait setup to mclk    2        ns
twh     nwait hold from mclk   2        ns
tale    address latch enable   12       ns    5
tabe    address bus enable     2 / 9    ns    2
tabz    address bus disable    20       ns    4
taddr   mclk to address delay  18       ns    2
tah     address hold time      4        ns    2
tdbe    dbe to data enable     4 / 12   ns    2
tde     mclk to data enable    7        ns    2
tdbz    dbe to data disable    16 / 22  ns    4
tdz     mclk to data disable   25       ns    4
tdout   data out delay         27       ns    2
tdoh    data out hold          4        ns    2
tdis    data in setup          1        ns
tdih    data in hold           7        ns
tabts   abort setup time       4        ns
tabth1  abort hold time        2        ns    3
tabth2  abort hold time        2        ns    3
tmse    nmreq & seq enable     6        ns
tmsz    nmreq & seq disable    21       ns    4
tmsd    nmreq & seq delay      30       ns
tmsh    nmreq & seq hold       4        ns

table 13-3: arm610 fclk and mclk relationship
note

1 mclk timings measured between clock edges at 50% of vdd.
2 the timings of these buses are measured to ttl levels.
3 tabth1 is a requirement for arm610. to ensure compatibility with future processors, designs should meet tabth2. tabth2 is not tested on arm610.
4 see figure 13-2: disable times specification.
5 see 13.2.2 tald measurement.
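two pieces of arithmetic implied by this chapter can be sketched as follows. the sketch assumes that the clock low and high times in tables 13-2 and 13-3 are minimum values, so that their sum is the shortest permitted clock period, and that the derating figure in table 13-1 is applied only to load in excess of the 50 pf test load. the function names are illustrative, and the 18 ns delay is simply the taddr entry from table 13-3 used as an example input.

```python
TEST_LOAD_PF = 50.0          # test load from table 13-1
DERATING_NS_PER_PF = 0.072   # output derating from table 13-1


def max_clock_mhz(low_ns, high_ns):
    """highest clock frequency whose low and high phases meet both limits."""
    return 1000.0 / (low_ns + high_ns)


def derated_delay_ns(delay_at_test_load_ns, load_pf):
    """output delay scaled linearly for load above the test load."""
    extra_pf = max(0.0, load_pf - TEST_LOAD_PF)   # assumption: no credit below the test load
    return delay_at_test_load_ns + DERATING_NS_PER_PF * extra_pf


fclk_mhz = max_clock_mhz(15.0, 15.0)      # tfckl, tfckh
mclk_mhz = max_clock_mhz(26.0, 26.0)      # tmckl, tmckh
taddr_80pf = derated_delay_ns(18.0, 80.0)
print(f"fclk {fclk_mhz:.1f} mhz, mclk {mclk_mhz:.1f} mhz, taddr at 80 pf {taddr_80pf:.2f} ns")
```

under these assumptions the fclk result is consistent with the 33 mhz quoted in the feature summary.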
14 physical details

this chapter gives a detailed physical description of the arm610.

14.1 physical details
14.1 physical details

[figure 14-1: arm610 144 pin tqfp mechanical dimensions in mm, view from above. package marking: P610ARM-B. pin groups shown: pin 1 to pin 36, pin 37 to pin 72, pin 73 to pin 108, pin 109 to pin 144. dimensions shown: 22.00, 20.00, 0.5 typ, 0.22, 1.40, 1.60 max.]
15 pinout

this chapter gives the arm610 pinout details.

15.1 pinout
15.1 pinout

pin  signal      type
1    mse         i
2    nmreq       o
3    seq         o
4    dbe         i
5    vss2        -
6    vdd2        -
7    d[0]        i/o
8    d[1]        i/o
9    d[2]        i/o
10   d[3]        i/o
11   d[4]        i/o
12   d[5]        i/o
13   d[6]        i/o
14   d[7]        i/o
15   d[8]        i/o
16   vss2        -
17   vdd2        -
18   vss1        -
19   vdd1        -
20   d[9]        i/o
21   d[10]       i/o
22   d[11]       i/o
23   d[12]       i/o
24   d[13]       i/o
25   d[14]       i/o
26   d[15]       i/o
27   d[16]       i/o
28   d[17]       i/o
29   d[18]       i/o
30   d[19]       i/o
31   vdd2        -
32   vss2        -
33   d[20]       i/o
34   d[21]       i/o
35   d[22]       i/o
36   d[23]       i/o
37   d[24]       i/o
38   d[25]       i/o
39   d[26]       i/o
40   vss1        -
41   vss2        -
42   vdd2        -
43   d[27]       i/o
44   d[28]       i/o
45   d[29]       i/o
46   d[30]       i/o
47   d[31]       i/o
48   tdo         o
49   tdi         i
50   ntrst       i
51   vdd1        -
52   tms         i
53   tck         i
54   n/c         -
55   n/c         -
56   n/c         -
57   n/c         -
58   n/c         -
59   testin[8]   i
60   testin[9]   i
61   vdd1        -
62   vss1        -
63   testin[10]  i
64   testin[11]  i
65   testin[12]  i
66   testin[13]  i
67   testin[14]  i
68   testin[15]  i
69   vss2        -
70   vdd2        -
71   nr/w        o
72   nb/w        o
73   lock        o
74   abe         i
75   a[0]        o
76   a[1]        o
77   a[2]        o
78   vss2        -
79   vdd2        -
80   a[3]        o
81   a[4]        o
82   a[5]        o
83   a[6]        o
84   a[7]        o
85   a[8]        o
86   a[9]        o
87   a[10]       o
88   a[11]       o
89   a[12]       o
90   vdd2        -
91   vss1        -
92   vdd1        -
93   vss2        -
94   a[13]       o
95   a[14]       o
96   a[15]       o
97   a[16]       o
98   a[17]       o
99   a[18]       o
100  a[19]       o
101  a[20]       o
102  vdd2        -
103  vss2        -
104  a[21]       o
105  a[22]       o
106  a[23]       o
107  a[24]       o
108  a[25]       o
109  a[26]       o
110  a[27]       o
111  a[28]       o
112  vdd2        -
113  vss2        -
114  a[29]       o
115  a[30]       o
116  a[31]       o
117  ale         i
118  n/c         -
119  n/c         -
120  n/c         -
121  vss1        -
122  vdd1        -
123  testin[7]   i
124  testin[6]   i
125  testin[5]   i
126  testin[4]   i
127  testin[3]   i
128  testin[2]   i
129  testin[1]   i
130  testin[0]   i
131  nfiq        i
132  nirq        i
133  testout[0]  o
134  testout[1]  o
135  testout[2]  o
136  testin[16]  i
137  nreset      i
138  abort       i
139  fclk        i
140  mclk        i
141  vdd2        -
142  vss2        -
143  nwait       i
144  sna         i

table 15-1: arm610 in 144 pin thin quad flat pack
a backward compatibility

this chapter gives an overview of arm6 backward compatibility.

a.1 backward compatibility
a.1 backward compatibility

two of the control register bits, prog32 and data32, allow one of three processor configurations to be selected as follows:

1 26-bit program and data space (prog32 low, data32 low). this configuration forces arm610 to operate like the earlier arm processors with 26-bit address space. the programmer's model for these processors applies, but the new instructions to access the cpsr and spsr registers operate as detailed elsewhere in this document. in this configuration it is impossible to select a 32-bit operating mode, and all exceptions (including address exceptions) enter the exception handler in the appropriate 26-bit mode.

2 26-bit program space and 32-bit data space (prog32 low, data32 high). this is the same as the 26-bit program and data space configuration, but with address exceptions disabled to allow data transfer operations to access the full 32-bit address space.

3 32-bit program and data space (prog32 high, data32 high). this configuration extends the address space to 32 bits, introduces major changes in the programmer's model as described below and provides support for running existing 26-bit programs in the 32-bit environment.

the fourth processor configuration which is possible (26-bit data space and 32-bit program space) should not be selected.

when configured for 26-bit program space, arm610 is limited to operating in one of four modes known as the 26-bit modes. these modes correspond to the modes of the earlier arm processors and are known as:

user26
fiq26
irq26
supervisor26

these are the normal operating modes in this configuration; the 26-bit modes are provided only for backwards compatibility, to allow execution of programs originally written for earlier arm processors.
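the mapping from the two control register bits to a processor configuration can be summarised in a few lines. this is an illustrative sketch only: the function name and the returned strings are hypothetical, and raising an error for the fourth combination is a design choice made here to reflect the statement that it should not be selected.

```python
def processor_configuration(prog32: bool, data32: bool) -> str:
    """map the prog32 and data32 control bits (true = high) to a configuration."""
    if not prog32 and not data32:
        return "26-bit program and data space"
    if not prog32 and data32:
        return "26-bit program space and 32-bit data space"
    if prog32 and data32:
        return "32-bit program and data space"
    # prog32 high with data32 low: the fourth configuration
    raise ValueError("32-bit program space with 26-bit data space should not be selected")


for prog32, data32 in [(False, False), (False, True), (True, True)]:
    print(prog32, data32, processor_configuration(prog32, data32))
```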
information relating to products and services furnished herein by zarlink semiconductor inc. or its subsidiaries (collectively "zarlink") is believed to be reliable. however, zarlink assumes no liability for errors that may appear in this publication, or for liability otherwise arising from the application or use of any such information, product or service or for any infringement of patents or other intellectual property rights owned by third parties which may result from such application or use. neither the supply of such information or purchase of product or service conveys any license, either express or implied, under patents or other intellectual property rights owned by zarlink or licensed from third parties by zarlink, whatsoever. purchasers of products are also hereby notified that the use of product in certain ways or in combination with zarlink, or non-zarlink furnished goods or services may infringe patents or other intellectual property rights owned by zarlink.

this publication is issued to provide information only and (unless agreed by zarlink in writing) may not be used, applied or reproduced for any purpose nor form part of any order or contract nor to be regarded as a representation relating to the products or services concerned. the products, their specifications, services and other information appearing in this publication are subject to change by zarlink without notice. no warranty or guarantee express or implied is made regarding the capability, performance or suitability of any product or service. information concerning possible methods of use is provided as a guide only and does not constitute any guarantee that such methods of use will be satisfactory in a specific piece of equipment. it is the user's responsibility to fully determine the performance and suitability of any equipment using such information and to ensure that any publication or data used is up to date and has not been superseded. manufacturing does not necessarily include testing of all functions or parameters. these products are not suitable for use in any medical products whose failure to perform may result in significant injury or death to the user. all products and materials are sold and services provided subject to zarlink's conditions of sale which are available on request.

purchase of zarlink's i2c components conveys a licence under the philips i2c patent rights to use these components in an i2c system, provided that the system conforms to the i2c standard specification as defined by philips.

zarlink, zl and the zarlink semiconductor logo are trademarks of zarlink semiconductor inc. copyright zarlink semiconductor inc. all rights reserved.

technical documentation - not for resale

for more information about all zarlink products visit our web site at www.zarlink.com

